
Wednesday, 15 November 2023



Lecturer: Afdhal Divine, S.Pd.I., M.Pd.





ARRANGED BY:




GROUP 10

Nurmaisyah Pulungan : 20140124

Nurul Padilah Sitanggang : 20140126













PRIMARY TEACHER EDUCATION

FACULTY OF SOCIAL SCIENCES AND LANGUAGES EDUCATION

SOUTH TAPANULI EDUCATIONAL INSTITUTE

2023




FOREWORD




We give thanks to Allah SWT for all His grace and guidance, which have allowed us to complete this paper, "Evaluation Instrument Criteria," on time. The purpose of writing this paper is to fulfill the assignment for the Education Evaluation course given by our lecturer, Mr. Afdhal Divine, S.Pd.I., M.Pd.

We would like to thank Mr. Afdhal Divine, S.Pd.I., M.Pd., the supervising lecturer, for giving us this assignment and thereby allowing us to broaden our knowledge and insight in the field of study we are pursuing. We would also like to thank everyone who contributed to the writing of this paper. We realize that the paper we have prepared is still far from perfect; therefore, we welcome constructive criticism and suggestions so that we can write better papers in the future. We hope this paper will be useful for its readers.













Padangsidempuan, 14 May 2023







Presenters,



















TABLE OF CONTENTS




FOREWORD

TABLE OF CONTENTS

CHAPTER I: INTRODUCTION

Background

Problem Formulation

Objectives

CHAPTER II: DISCUSSION

Evaluation Instrument Criteria

Evaluation Instrument Factors

CHAPTER III: CLOSING

A. Conclusion

B. Suggestion

BIBLIOGRAPHY

CHAPTER I

INTRODUCTION

A. Background

In the process of evaluating learning, or assessing learning processes and outcomes, teachers often use certain measuring instruments, both tests and non-tests. These instruments play a very important role in determining the effectiveness of the learning process in schools. Given this importance, a measuring instrument must meet certain requirements and display the characteristics of a good measuring instrument. In practice in madrasas, however, teachers often construct measuring instruments without following any particular rules. Some teachers take their measuring instruments (such as test questions or final semester exams) directly from source books, even though we know that many source books do not match the prescribed syllabus. What happens, then, if the questions used do not match the material presented? Other teachers reuse old questions whose quality is unknown. All of this results from teachers' limited understanding of what makes a good measuring instrument.

Test quality analysis is a stage that must be carried out to determine the quality of a test, both as a whole and for the individual questions that make it up. In assessing learning outcomes, a test is expected to describe a sample of behavior and to produce objective and accurate scores. If the tests teachers use are not good, the results obtained will certainly not be good either. This can be detrimental to the students themselves, because the results they obtain will be neither objective nor fair. The tests teachers use must therefore be of good quality, viewed from various aspects. A test should be prepared in accordance with the principles and procedures of test construction, and after it has been used it is necessary to determine whether it was of good or poor quality. To find out whether a test is good, its quality must be analyzed, namely by knowing the criteria for selecting evaluation instruments.

B. Problem Formulation

1. What are the characteristics of evaluation instruments?

2. What are the evaluation instrument criteria?




C. Objectives

1. To find out the characteristics of the evaluation instrument.

2. To find out the criteria for the evaluation instrument.

CHAPTER II

DISCUSSION

Evaluation Instrument Criteria

In evaluation there are criteria, where a criterion is a measure that serves as the basis for assessing or determining something. A criterion can also be defined as a benchmark characteristic that is used as a point of comparison for other characteristics. For example, the validity of an intelligence test is judged by how well it measures intelligence, and so on.

Evaluation instruments serve two activities: measuring and assessing. Measuring is comparing something against a unit of measure, while assessing is making a judgment about something in terms of good or bad. In teaching and learning, the person who carries out evaluation is the teacher, namely the one who plans and conducts teaching and learning activities. As a figure who constantly interacts with students, the teacher needs regular evaluation in order to improve or perfect the learning process being carried out.


It has been stated that the characteristics of a good evaluation instrument are "valid, reliable, relevant, representative, practical, discriminative".

Valid means that a measuring instrument truly and accurately measures what it is intended to measure. For example, a measuring instrument for the subject of Fiqh must truly, and only, measure students' ability in studying Fiqh, and must not be mixed with other subject matter. The validity of a measuring instrument can be viewed from various aspects, including predictive validity, comparative validity, content validity, construct validity, and others; these are explained further below.

Reliable means that a measuring instrument can be considered dependable if it gives consistent results. For example, if a measuring instrument is given to a group of students now and then given again to the same group of students later, and the results turn out to be the same or nearly the same, the instrument can be said to have a high level of reliability.
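To make this concrete, the following is a minimal sketch, not part of the original paper, of how test-retest reliability can be estimated as the correlation between two administrations of the same test; the score lists are hypothetical.

```python
# Minimal sketch (illustrative only): test-retest reliability as the
# Pearson correlation between two administrations of the same test.
from statistics import correlation  # available in Python 3.10+

# Hypothetical scores for the same group of students, tested twice.
first_administration = [78, 85, 62, 90, 71, 66, 88, 74]
second_administration = [80, 83, 65, 92, 70, 64, 85, 76]

r = correlation(first_administration, second_administration)
print(f"Test-retest reliability coefficient: r = {r:.2f}")
# A coefficient close to 1.0 suggests the instrument gives consistent results.
```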

Relevant means that the measuring instruments used must be in accordance with the competency standards, basic competencies, and indicators that have been determined. Measuring instruments must also be appropriate to the domain of the learning outcome, such as the cognitive, affective, or psychomotor domain. Measuring the cognitive domain with a non-test instrument, for example, would be irrelevant.

Representative means that the material in the measuring instrument must truly represent all the material presented. This can be achieved if the teacher uses the syllabus as a reference when selecting test material. Teachers must also pay attention to the material selection process: which material is applicable and which is not, which is important and which is not.

Practical means easy to use. If a measuring instrument meets the other requirements but is difficult to use, it is not practical. Practicality is judged not only by the makers of the instrument (teachers), but also by others who wish to use it.

Discriminative means that the measuring instrument must be constructed in such a way that it can reveal even the smallest differences. The better a measuring instrument, the more accurately it shows such differences. Whether an instrument is sufficiently discriminative is usually determined by testing its discriminating power.
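As an illustration, here is a minimal sketch of how an item's discriminating power might be computed, using the common convention of comparing the top and bottom 27% of examinees; the data and the 0.4 threshold mentioned in the comment are illustrative assumptions, not from the paper.

```python
# Minimal sketch (illustrative only): item discriminating power as the
# difference between upper- and lower-group proportions answering correctly.

def discrimination_index(scores_with_item):
    """scores_with_item: list of (total_score, item_correct) tuples,
    where item_correct is 1 if the student answered the item correctly."""
    ranked = sorted(scores_with_item, key=lambda s: s[0], reverse=True)
    k = max(1, round(len(ranked) * 0.27))  # conventional 27% groups
    upper, lower = ranked[:k], ranked[-k:]
    p_upper = sum(c for _, c in upper) / k
    p_lower = sum(c for _, c in lower) / k
    return p_upper - p_lower  # D >= 0.4 is often read as a good item

# Hypothetical data: (student's total test score, correct on this item?)
students = [(90, 1), (85, 1), (80, 1), (75, 0), (70, 1),
            (65, 0), (60, 0), (55, 0), (50, 0), (45, 0)]
print(f"Discrimination index D = {discrimination_index(students):.2f}")
```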



Specific means that an instrument is prepared and used specifically for the object being evaluated. If the instrument is a test, the test answers should not give rise to ambiguity or speculation.

Proportional means that an instrument must have a proportional distribution of difficulty levels among difficult, medium, and easy items.
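The sketch below illustrates one way to check this proportionality, using the conventional difficulty index p (the proportion of students answering an item correctly) and commonly used 0.30/0.70 cut-offs; the item data and the 25/50/25 split mentioned in the comment are illustrative assumptions, not prescribed by the paper.

```python
# Minimal sketch (illustrative only): item difficulty index and a check
# that difficulty levels are spread proportionally across the test.

def difficulty(correct_count, num_students):
    return correct_count / num_students  # p = proportion answering correctly

def category(p):
    if p > 0.70:
        return "easy"
    if p >= 0.30:
        return "medium"
    return "difficult"

# Hypothetical item data: number of students (out of 40) answering correctly.
items = [35, 30, 22, 18, 25, 10, 8, 28, 33, 15]
counts = {"easy": 0, "medium": 0, "difficult": 0}
for correct in items:
    counts[category(difficulty(correct, 40))] += 1
print(counts)  # e.g. aim for roughly a 25/50/25 easy/medium/difficult split
```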

Evaluation must be carried out continuously. With repeated evaluations, the teacher gets a clearer picture of the student's condition. Tests held on the spot, and only once or twice, cannot provide objective results about the student's condition, because the chance factor will strongly affect the results. A child who is actually capable may be in poor condition when the teacher gives the test, for example because he spent the night caring for his sick mother, so his test score may well be poor too.

Evaluation must also be carried out comprehensively (thoroughly). Comprehensive evaluation here means evaluation that:

Covers all the material.

Includes all aspects of thinking (memory, understanding, application, and so on).

Uses various methods, namely written tests, oral tests, performance tests, incidental observations, and so on.

Validity

Before you use a test, you should first measure its degree of validity based on certain criteria. In other words, to see whether a test is valid, you must compare the scores students obtain on it with scores regarded as the standard. For example, a student's final semester exam score in one subject can be compared with the final semester exam score in another subject; the closer the two scores, the more the final exam questions can be said to be valid. The validity of a test is closely related to the purpose for which the test is used, and there is no validity that applies in general. This means that if a test provides appropriate and reliable information for achieving a particular purpose, then the test is valid for that purpose.
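As a sketch of this comparison, the code below correlates hypothetical test scores with scores treated as the standard value, writing out the Pearson product-moment formula explicitly; this is one common way to quantify criterion-related validity, not a procedure prescribed by the paper.

```python
# Minimal sketch (illustrative only): criterion-related validity computed
# with the Pearson product-moment formula written out explicitly.
import math

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical data: exam scores vs. scores treated as the standard value.
test_scores = [70, 82, 90, 65, 75, 88, 60, 79]
standard_scores = [72, 80, 93, 60, 78, 85, 58, 81]
print(f"Validity coefficient: r = {pearson(test_scores, standard_scores):.2f}")
```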

There are two important elements in this validity. First, validity indicates a degree: it may be perfect, moderate, or low. Second, validity is always connected to a specific decision or goal, as argued by R. L. Thorndike and H. P. Hagen (1977: 56): "validity is always in relation to a specific decision or use". Meanwhile, Gronlund (1985: 79-81), as cited in Zainal Arifin's book Learning Evaluation, identified three factors that influence the validity of test results, namely "evaluation instrument factors, evaluation and scoring administration factors, and factors from students' answers".

Evaluation Instrument Factors

Developing an evaluation instrument is not easy, especially if an evaluator does not understand evaluation procedures and techniques. If the evaluation instrument is not good, the evaluation results will be poor as well. For this reason, in developing an evaluation instrument an evaluator must pay attention to the matters that affect its validity and that relate to the procedures for preparing the instrument, such as the syllabus, the question grid, instructions for answering questions and filling out answer sheets, answer keys, the use of effective sentences, the form of the answer alternatives, the level of difficulty, discriminating power, and so on.

Factors from students' answers

In practice, students' answers actually have more influence than the two previous factors. These factors include students' tendency to answer quickly but inaccurately, the urge to guess, and the use of particular language styles when answering questions in essay form.

Surface validity

This validity uses very simple criteria, because it only looks at the face or appearance of the instrument itself. This means that if a test at first glance appears good for revealing the phenomenon to be measured, the test can be said to meet the requirement of surface validity, and no in-depth judgment is needed.

Content validity

Content validity is often used in measuring learning outcomes. Its main aim is to find out to what extent students have mastered the lesson material that has been presented, and what psychological changes arise in students after they experience a particular learning process. Viewed from the perspective of its use in assessing learning outcomes, content validity is often also called curricular validity or formulation validity. Curricular validity concerns the question of whether the test material is relevant to the prescribed curriculum.

This question arises because it often happens that the test material does not cover all the aspects to be measured, such as the cognitive, affective, and psychomotor aspects, but only knowledge of certain lesson facts. With curricular validity, it is hoped that clear accuracy and completeness will emerge from covering all aspects contained in the question grid and the relevant Learning Implementation Plan (RPP). Curricular validity can be established in several ways, including matching the test material against the syllabus and question grid, holding discussions with fellow educators, or re-examining the substance of the concept to be measured.



Another form of validity involves a specific pattern of criteria that is correlated with the results of the test. In relation to such special criteria, Anastasi, in Conny Semiawan Stamboel (1986: 50), suggests that there are eight criteria that serve as comparative material for formulating what a test will investigate, namely "age differentiation, academic progress, criteria for implementing special training, criteria for implementing work, assessment, contrast groups, correlation with other tests, and internal consistency".

The most important criterion in the evaluation instrument is age. Most intelligence tests, both those used in madrasas and pre-madrasah tests, are always compared with chronological age to determine whether the scores increase with increasing age. If a test is considered valid, then students' test scores will increase as they grow older. However, this assumption does not hold for the development of all functions in consistent relation to increasing age (as several personality tests demonstrate). It is also said that this varies with each culture.

In general, intelligence tests are validated by academic progress. It is also often said that the longer someone studies at school, and the higher their education, the greater their academic progress. In fact, each type and level of education is selective, and students who are unable to continue usually drop out. However, there are also many non-intellectual factors that influence a student's educational success. In other words, a person's educational success or failure depends not only on intellectual factors but also on non-intellectual ones. To obtain a comprehensive and holistic picture of this, further investigation is necessary.

In the evaluation instrument there are criteria for implementing special training. The criteria for developing special aptitude tests are based on achievement in certain specific training. Several professional aptitude tests have been validated against learning outcome tests in those fields, for example tests for entering the professions of medicine, law, and so on. There are also tests for entering certain fields which are called tailor-made tests, namely tests created specifically for that purpose, such as flight tests.

In the evaluation instrument criteria there is also assessment. Assessment here means a technique for obtaining information about the learning progress of students in madrasas. It also covers work that requires special training, or success in the personal assessment, by an observer, of various psychological functions, for example conditions, originality, leadership, or honesty. If the conditions to be recognized arise in a situation where a special ability is expressed, the observation needs to be accompanied by a carefully prepared assessment scale. The evaluation instrument criteria also include correlation with other tests. The correlation between a new test and an old test is a comparison of criteria in investigating the same behavior. In this case, a written verbal test can be compared with an individual test or a group test. To measure whether a new test is valid and free from the influence of other factors, other types of tests are used for comparison; thus, personality tests are sometimes correlated with intelligence tests or learning outcome tests. The scores obtained on such tests describe not participation but learning achievement. In fact, the discussion of validity emphasizes not the test itself but the test results and the experience gained from them.

Messick (1993) states that validity traditionally consists of:

Content validity, namely the suitability of the material measured in the test;

Criterion-related validity, namely comparing the test with one or more criteria;

Predictive validity, namely agreement with measurement results obtained later using other instruments.


CHAPTER III

CLOSING

CONCLUSION

Evaluation must be carried out continuously. With repeated evaluations, the teacher gets a clearer picture of the student's condition. Tests held on the spot, and only once or twice, cannot provide objective results about the student's condition, because the chance factor will strongly affect the results. A child who is actually capable may be in poor condition when the teacher gives the test, for example because he spent the night caring for his sick mother, so his test score may well be poor too.

In relation to special criteria, Anastasi, in Conny Semiawan Stamboel (1986: 50), suggests that there are eight criteria that serve as comparative material for formulating what a test will investigate, namely "age differentiation, academic progress, criteria for implementing special training, criteria for implementing work, assessment, contrast groups, correlation with other tests, and internal consistency".

If the conditions to be recognized arise in a situation where a special ability is expressed, the observation needs to be accompanied by a carefully prepared assessment scale. The evaluation instrument criteria also include correlation with other tests: the correlation between a new test and an old test is a comparison of criteria in investigating the same behavior.




SUGGESTION

We, the presenters, realize that this paper still has many shortcomings. For this reason, we welcome criticism and suggestions so that we can produce better papers and materials. We hope that this paper broadens the insight of every reader and is useful for all of us.

BIBLIOGRAPHY

http://sriwildaningsih22.blogspot.com/2018/01/kriteria-intumen-produk.html, accessed 24 May 2023.

Stufflebeam, D. L. 1974b. Meta-Evaluation (Paper No. 3). Kalamazoo, MI: Western Michigan University Evaluation Center.

Isaac, S. and Michael, W. B. 1981. Handbook in Research and Evaluation. San Diego, CA: EdITS.


Questions

1. Ainun Siregar

From what perspectives can an evaluation tool be said to be of good quality?




Answer: A good evaluation tool can be judged from the following aspects: validity, reliability, discriminating power, degree of difficulty, effectiveness of the options, objectivity, and practicability. An evaluation tool is said to be valid if it is able to evaluate what the teacher intends it to evaluate.




2. Mutiara Tapsel Siregar

Why do researchers have to prepare instruments in their research? What are their uses and benefits?




Answer: Because research instruments are the tools used to obtain research data. Without instruments we would not be able to collect the data needed for the research, and if the data do not exist, the research cannot be carried out.




3. Nur Aisyah

What should be done if some questions are invalid: do they need to be replaced, or should only the valid questions be used?




Answer: In our opinion, an invalid item can simply be discarded, provided each indicator is still covered by a valid item; for this reason, write at least two items per indicator. If an invalid item covers an indicator that is considered important, it needs to be revised or replaced, and the validity test needs to be carried out again.




4. Rachel Intan Agustina

If an instrument can measure, what must it measure for the instrument to be said to be valid?




Answer: If an instrument is said to be valid, it means that the instrument can be used to measure what it is supposed to measure. A reliable instrument is one that, when used several times to measure the same object, produces the same data.


