JOMO KENYATTA UNIVERSITY OF AGRICULTURE AND TECHNOLOGY
INOORERO UNIVERSITY
RESEARCH METHODOLOGY: ROTICH BENARD KIPKEMOI BOC-008-0312/2007

DATA COLLECTION METHODS

Methods of data collection

The term data means groups of information that represent the qualitative or quantitative attributes of a variable or set of variables. Data are typically the results of measurements and can be the basis of graphs, images, or observations of a set of variables. Data are often viewed as the lowest level of abstraction from which information and knowledge are derived. Data can be classified into primary and secondary data.
To carry out research on a particular subject, a researcher needs to collect data. Data takes one of two forms: primary or secondary.

Primary data is information collected first hand by researchers. A primary data source originates from the first-hand knowledge of the person referenced in the data or from a first-hand witness; it is data collected under the control and supervision of an investigator. Some examples of primary data are:
• Data from a study to determine the morale of the employees in a company
• Surveys
• Interviews
• Focus groups
• Questionnaires

A secondary data source means that the information is second-hand. Secondary data is information that is already available and that the researcher uses as a source for his/her research. Different forms of secondary data include:
• Journals
• Books
• Census data
• Newspaper articles
• Biographies

The distinction between primary and secondary data is only relative. The primary data of one study will serve as secondary data of another study. For example, the census data of a country is primary data when it is used to find the size of its population.
This in turn will serve as data to plan the public distribution system of each taluk of the different districts in each state of the country. In that situation, the same data is secondary data.

PRIMARY DATA

The different methods used for primary data collection are: observation, personal interview, telephone interview, questionnaires, mail survey and the case study method.

Observation method

In observation, the investigator collects data through personal observation. Consider an example in which an investigator collects data about the organizational climate in an organization through direct observation.
Observation involves recording the behavioural patterns of people, objects and events in a systematic manner. Observational methods may be:
• structured or unstructured
• disguised or undisguised
• natural or contrived
• personal
• mechanical
• non-participant
• participant, with the participant taking a number of different roles.

Structured or unstructured

In structured observation, the researcher specifies in detail what is to be observed and how the measurements are to be recorded. It is appropriate when the problem is clearly defined and the information needed is specified.
In unstructured observation, the researcher monitors all aspects of the phenomenon that seem relevant. It is appropriate when the problem has yet to be formulated precisely and flexibility is needed in observation to identify key components of the problem and to develop hypotheses. The potential for bias is high. Observation findings should be treated as hypotheses to be tested rather than as conclusive findings.

Disguised or undisguised

In disguised observation, respondents are unaware they are being observed and thus behave naturally. Disguise is achieved, for example, by hiding, or using hidden equipment or people disguised as shoppers. In undisguised observation, respondents are aware they are being observed. There is a danger of the Hawthorne effect – people behave differently when being observed.

Natural or contrived

Natural observation involves observing behaviour as it takes place in the environment, for example, eating hamburgers in a fast food outlet. In contrived observation, the respondents’ behaviour is observed in an artificial environment, for example, a food tasting session.

Personal
In personal observation, a researcher observes actual behaviour as it occurs. The observer does not normally attempt to control or manipulate the phenomenon being observed; the observer merely records what takes place.

Mechanical

Mechanical devices (video, closed circuit television) record what is being observed. These devices may or may not require the respondent’s direct participation. They are used for continuously recording on-going behaviour.

Non-participant

The observer does not normally question or communicate with the people being observed. He or she does not participate.
Participant

In participant observation, the researcher becomes, or is, part of the group that is being investigated. Participant observation has its roots in ethnographic studies (study of man and races) where researchers would live in tribal villages, attempting to understand the customs and practices of that culture. It has a very extensive literature, particularly in sociology (development, nature and laws of human society) and anthropology (physiological and psychological study of man). Organizations can be viewed as ‘tribes’ with their own customs and practices.

Advantages:
1) Helps capture the behavior of customers directly.

Disadvantages:
1) Time consuming and costly exercise.
2) Personal bias of investigators will distort the findings.

Interview

Interviewing is a technique that is primarily used to gain an understanding of the underlying reasons and motivations for people’s attitudes, preferences or behavior. Interviews can be undertaken on a personal one-to-one basis or in a group. They can be conducted at work, at home, in the street or in a shopping center, or some other agreed location. Interviews can be personal interviews or telephone interviews.
A personal interview is a survey method of data collection which employs a questionnaire. The components of the personal interview are the researcher, the interviewer, the interviewee and the interview environment.

Types of interview

Structured:
• Based on a carefully worded interview schedule.
• Frequently requires short answers, with the answers being ticked off.
• Useful when there are a lot of questions which are not particularly contentious or thought provoking.
• Respondent may become irritated by having to give over-simplified answers.

Semi-structured
The interview is focused by asking certain questions but with scope for the respondent to express him or herself at length.

Unstructured

This is also called an in-depth interview. The interviewer begins by asking a general question. The interviewer then encourages the respondent to talk freely. The interviewer uses an unstructured format, the subsequent direction of the interview being determined by the respondent’s initial reply. The interviewer then probes for elaboration, with ‘Why do you say that?’, ‘That’s interesting, tell me more’ or ‘Would you like to add anything else?’ being typical probes. The following section is a step-by-step guide to conducting an interview. You should remember that all situations are different and therefore you may need refinements to the approach.

Personal interview

Advantages:
• Serious approach by respondent resulting in accurate information.
• Good response rate.
• Completed and immediate.
• Possible in-depth questions.
• Interviewer in control and can give help if there is a problem.

Disadvantages:
• Need to set up interviews.
• Time consuming.
• Geographic limitations.
• Can be expensive.
• Normally need a set of questions.
Telephone interview

This is an alternative form of interview to the personal, face-to-face interview.

Advantages:
• Relatively cheap.
• Quick.
• Can cover reasonably large numbers of people or organizations.
• Wide geographic coverage.

Disadvantages:
• Often connected with selling.
• Questionnaire required.
• Not everyone has a telephone.
• Repeat calls are inevitable – an average of 2.5 calls to get someone.
• Time is wasted.

Questionnaires

Questionnaires are a popular means of collecting data, but are difficult to design and often require many rewrites before an acceptable questionnaire is produced.
A questionnaire consists of a set of well-formulated questions to probe and obtain responses from respondents.

Types of questions

Closed questions

A question is asked and then a number of possible answers are provided for the respondent. The respondent selects the answer which is appropriate. Closed questions are particularly useful in obtaining factual information:

Sex: Male [ ] Female [ ]
Did you watch television last night? Yes [ ] No [ ]

Frequently, questions are asked to find out the respondents’ opinions or attitudes to a given situation. A Likert scale provides a battery of attitude statements.
The respondent then says how much they agree or disagree with each one:

Read the following statements and then indicate by a tick whether you strongly agree, agree, disagree or strongly disagree with the statement.

| |Strongly agree |Agree |Disagree |Strongly disagree |
|My visit has been good value for money | | | | |

Open questions

An open question such as ‘What are the essential skills a manager should possess?’ should be used as an adjunct to the main theme of the questionnaire and could allow the respondent to elaborate upon an earlier, more specific question. Open questions inserted at the end of major sections, or at the end of the questionnaire, can act as safety valves, and possibly offer additional information.

Advantages:
• Can be used as a method in its own right or as a basis for interviewing or a telephone survey.
• Can be posted, e-mailed or faxed.
• Can cover a large number of people or organizations.
• Wide geographic coverage.
• Relatively cheap.

Disadvantages:
• Design problems. Questions have to be relatively simple.
• Historically low response rate (although inducements may help).
• Time delay whilst waiting for responses to be returned.
• Require a return deadline.
• Several reminders may be required.
• Assumes no literacy problems.
• No control over who completes it.
• Not possible to give assistance if required.

Case study

The term case-study usually refers to a fairly intensive examination of a single unit such as a person, a small group of people, or a single company. Case-studies involve measuring what is there and how it got there. In this sense, it is historical.
It can enable the researcher to explore, unravel and understand problems, issues and relationships. It cannot, however, allow the researcher to generalize, that is, to argue that from one case-study the results, findings or theory developed apply to other similar case-studies. The case looked at may be unique and, therefore not representative of other instances. It is, of course, possible to look at several case-studies to represent certain features of management that we are interested in studying. The case-study approach is often done to make practical improvements. Contributions to general knowledge are incidental.
The case-study method has four steps:
1. Determine the present situation.
2. Gather background information about the past and key variables.
3. Test hypotheses. The background information collected will have been analyzed for possible hypotheses. In this step, specific evidence about each hypothesis can be gathered. This step aims to eliminate possibilities which conflict with the evidence collected and to gain confidence for the important hypotheses. The culmination of this step might be the development of an experimental design to test out more rigorously the hypotheses developed, or it might be to take action to remedy the problem.
4. Take remedial action. The aim is to check that the hypotheses tested actually work out in practice. Some action, correction or improvement is made and a re-check carried out on the situation to see what effect the change has brought about.

The case-study enables rich information to be gathered from which potentially useful hypotheses can be generated. It can be a time-consuming process. It is also inefficient in researching situations which are already well structured and where the important variables have been identified.
Case studies lack utility when attempting to reach rigorous conclusions or determine precise relationships between variables.

Assumptions:
I. Uniformity in the basic human nature in spite of the fact that human behavior may vary according to situations.
II. The assumption of studying the natural history of the unit concerned.
III. The comprehensive study of the unit concerned.

Advantages:
I. Good source of ideas about behavior.
II. Good opportunity for innovation.
III. Good method to study a rare phenomenon.
IV. Good method to challenge theoretical assumptions.
V. Good alternative or complement to the group focus of psychology.
Disadvantages:
I. Hard to draw definite cause-effect conclusions.
II. Hard to generalize from a single case.
III. Possible biases in data collection and interpretation.
IV. Can be very inaccurate if done poorly.

SECONDARY DATA

All methods of data collection can supply quantitative data (numbers, statistics or financial) or qualitative data (usually words or text). Quantitative data may often be presented in tabular or graphical form. Secondary data is data that has already been collected by someone else for a purpose different from your own. Sources can be classified as:
• paper-based sources – books, journals, periodicals, abstracts, indexes, directories, research reports, conference papers, market reports, annual reports, internal records of organizations, newspapers and magazines
• electronic sources – CD-ROMs, on-line databases, Internet, videos and broadcasts.

The main sources of qualitative and quantitative secondary data include the following:
• Official or government sources.
• Unofficial or general business sources.

Secondary sources of information may be divided into two categories: internal sources and external sources.

Internal sources of secondary information
Sales data: All organizations collect information in the course of their everyday operations. Orders are received and delivered, costs are recorded, sales personnel submit visit reports, invoices are sent out, returned goods are recorded and so on. Much of this information is of potential use in marketing research, but surprisingly little of it is actually used. Organizations frequently overlook this valuable resource by not beginning their search of secondary sources with an internal audit of sales invoices, orders, inquiries about products not stocked, returns from customers and sales force customer calling sheets.
For example, consider how much information can be obtained from sales orders and invoices: • Sales by territory • Sales by customer type • Prices and discounts • Average size of order by customer, customer type, geographical area • Average sales by sales person and • Sales by pack size and pack type, etc. This type of data is useful for identifying an organization’s most profitable product and customers. It can also serve to track trends within the enterprise’s existing customer group.
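As a rough illustration of the summaries listed above, the following Python sketch totals sales by territory and computes the average order size by customer type. The invoice records, field names and figures are hypothetical and are not taken from any real organization; a real audit would read them from the organization's own invoice files.

```python
from collections import defaultdict

# Hypothetical invoice records; field names and figures are illustrative only.
invoices = [
    {"territory": "North", "customer_type": "Retail", "amount": 1200.0},
    {"territory": "North", "customer_type": "Wholesale", "amount": 3000.0},
    {"territory": "South", "customer_type": "Retail", "amount": 800.0},
    {"territory": "South", "customer_type": "Retail", "amount": 1000.0},
]


def total_by(records, key):
    """Sum invoice amounts for each distinct value of `key`."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += record["amount"]
    return dict(totals)


def average_order_by(records, key):
    """Average invoice amount for each distinct value of `key`."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for record in records:
        sums[record[key]] += record["amount"]
        counts[record[key]] += 1
    return {group: sums[group] / counts[group] for group in sums}


sales_by_territory = total_by(invoices, "territory")
average_order_by_type = average_order_by(invoices, "customer_type")
print(sales_by_territory)
print(average_order_by_type)
```

The same two helper functions could be reused with other keys, such as sales person or pack size, provided the records carry those fields.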
Financial data: An organization has a great deal of data within its files on the cost of producing, storing, transporting and marketing each of its products and product lines. Such data has many uses in marketing research, including allowing measurement of the efficiency of marketing operations. It can also be used to estimate the costs attached to new products under consideration and to identify the level of utilization (in production, storage and transportation) at which an organization’s unit costs begin to fall.
Transport data: Companies that keep good records relating to their transport operations are well placed to establish which are the most profitable routes and loads, as well as the most cost effective routing patterns. Good data on transport operations enables the enterprise to perform trade-off analysis and thereby establish whether it makes economic sense to own or hire vehicles, or the point at which a balance of the two gives the best financial outcome.

Storage data: Data on the rate of stock turn and stock handling costs help in assessing the efficiency of certain marketing operations and the efficiency of the marketing system as a whole.
More sophisticated accounting systems assign costs to the cubic space occupied by individual products and the time period over which the product occupies the space. These systems can be further refined so that the profitability per unit, and rate of sale, are added. In this way, the direct product profitability can be calculated.

External sources of secondary data

1) Government statistics: These include policy papers and research reports owned by the government, some of which are sponsored by international agencies. These may include all or some of the following:
• Population censuses
• Social surveys, family expenditure surveys
• Import/export statistics
• Production statistics
• Agricultural statistics.

2) Trade associations: Trade associations differ widely in the extent of their data collection and information dissemination activities. However, it is worth checking with them to determine what they do publish. At the very least one would normally expect that they would produce a trade directory and, perhaps, a yearbook.

3) Commercial services: Published research reports and other publications are available from a wide range of organizations which charge for their information.
Typically, marketing people are interested in media statistics and consumer information which has been obtained from large scale consumer or farmer panels. The commercial organization funds the collection of the data, which is wide ranging in its content, and hopes to make its money from selling this data to interested parties. 4) National and international institutions: Bank economic reviews, university research reports, journals and articles are all useful sources to contact.
International agencies such as the World Bank, IMF, UNDP, ITC and FAO produce a plethora of secondary data which can prove extremely useful to the researcher.

Problems of secondary sources

Definitions

The researcher has to be careful, when making use of secondary data, of the definitions used by those responsible for its preparation. Suppose, for example, researchers are interested in rural communities and their average family size. If published statistics are consulted then a check must be done on how terms such as “family size” have been defined.
They may refer only to the nuclear family or include the extended family. Even apparently simple terms such as ‘farm size’ need careful handling. Such figures may refer to any one of the following: the land an individual owns; the land an individual owns plus any additional land he/she rents; the land an individual owns minus any land he/she rents out; all of his/her land or only that part of it which he/she actually cultivates. It should be noted that definitions may change over time, and where this is not recognized erroneous conclusions may be drawn.
Geographical areas may have their boundaries redefined, units of measurement and grades may change and imported goods can be reclassified from time to time for purposes of levying customs and excise duties. Measurement error When a researcher conducts fieldwork she/he is possibly able to estimate inaccuracies in measurement through the standard deviation and standard error, but these are sometimes not published in secondary sources. The only solution is to try to speak to the individuals involved in the collection of the data to obtain some guidance on the level of accuracy of the data.
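To make the two measures named above concrete, here is a minimal Python sketch that computes the sample standard deviation and the standard error of the mean for a small set of fieldwork measurements. The household sizes are hypothetical values chosen for illustration; the formula used for the standard error (standard deviation divided by the square root of the sample size) assumes a simple random sample.

```python
import math
import statistics

# Hypothetical fieldwork measurements (household sizes); illustrative only.
household_sizes = [4, 6, 5, 7, 3, 5, 6, 4]

n = len(household_sizes)
mean = statistics.mean(household_sizes)
sd = statistics.stdev(household_sizes)  # sample standard deviation (n - 1 denominator)
se = sd / math.sqrt(n)                  # standard error of the mean

print(f"n={n} mean={mean} sd={sd:.3f} se={se:.3f}")
```

A published secondary source that reports only the mean gives the reader no way to recover the last two figures, which is the difficulty described above.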
The problem is sometimes not so much ‘error’ but differences in the levels of accuracy required by decision makers. When the research has to do with large investments in, say, food manufacturing, management will want to set very tight margins of error in making market demand estimates. In other cases, having a high level of accuracy is not so critical. For instance, if a food manufacturer is merely assessing the prospects for one more flavour for a snack food already produced by the company then there is no need for highly accurate estimates in order to make the investment decision.
Source bias Researchers have to be aware of vested interests when they consult secondary sources. Those responsible for their compilation may have reasons for wishing to present a more optimistic or pessimistic set of results for their organisation. It is not unknown, for example, for officials responsible for estimating food shortages to exaggerate figures before sending aid requests to potential donors. Similarly, and with equal frequency, commercial organizations have been known to inflate estimates of their market shares.
Reliability The reliability of published statistics may vary over time. It is not uncommon, for example, for the systems of collecting data to have changed over time but without any indication of this to the reader of published statistics. Geographical or administrative boundaries may be changed by government, or the basis for stratifying a sample may have altered. Other aspects of research methodology that affect the reliability of secondary data are the sample size, response rate, questionnaire design and modes of analysis. Time scale
Most censuses take place at 10 year intervals, so data from this and other published sources may be out-of-date at the time the researcher wants to make use of the statistics. The time period during which secondary data was first compiled may have a substantial effect upon the nature of the data. For instance, the significant increase in the price obtained for Ugandan coffee in the mid-90’s could be interpreted as evidence of the effectiveness of the rehabilitation programme that set out to restore coffee estates which had fallen into a state of disrepair.
However, more knowledgeable coffee market experts would interpret the rise in Ugandan coffee prices in the context of large scale destruction of the Brazilian coffee crop, due to heavy frosts, in 1994, Brazil being the largest coffee producer in the world. Methods we use in controlling data; Reliability: Reliability is the extent to which the same finding will be obtained if the research was repeated at another time by another researcher. If the same finding can be obtained again, the instrument is consistent or reliable.
Example
a) A clock which is fast one day and slow the next day is unreliable.
b) Questions which ask for an opinion may produce different answers on different days because in the meantime people may have watched television or read newspapers and changed their opinions. The questions are unreliable.

If some form of questioning is used to obtain data, an assessment of the reliability should be conducted by a critical assessment of, for example:
• use of non-standard instructions
• the questions themselves
• response sets – order, wording, other alternatives

Questions about people’s opinions and attitudes always potentially suffer from low reliability because the opinions may change over a period of time depending on what people have heard, seen or read.

Validity

Validity is epitomized by the question: ‘Are we measuring what we think we are measuring?’ This is very difficult to assess. The following questions are typical of those asked to assess validity issues:
• Has the researcher gained full access to the knowledge and meanings of informants?
• Would experienced researchers use the same questions or methods?
No procedure is perfectly reliable. If a data collection procedure is unreliable then it is also invalid; if it is reliable, it is still not necessarily valid.

Triangulation

Triangulation is the crosschecking of data using multiple data sources or using two or more methods of data collection. There are different types of triangulation, including:
• time triangulation – longitudinal studies
• methodological triangulation – same method at different times or different methods on the same object of study
• investigator triangulation – uses more than one researcher.

Sampling error
Sampling error is a measure of the difference between the sample results and the population parameters being measured. It can never be eliminated, but if random sampling is used, sampling error occurs by chance but is reduced as the sample size increases. When non-random sampling is used this is not the case. Basic questions we need to ask to assess a sample are: • Is the sample random and representative of the population? • Is the sample small or large? Non-sampling error All errors, other than sampling errors, are non-sampling errors and can never be eliminated.
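The earlier point that, under random sampling, sampling error arises by chance but shrinks as the sample grows can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population is artificial (10,000 values drawn from a normal distribution with mean 50 and standard deviation 10), the sample sizes and number of repetitions are arbitrary, and the function name is hypothetical.

```python
import random
import statistics


def mean_absolute_sampling_error(population, sample_size, repetitions, rng):
    """Average |sample mean - population mean| over repeated simple random samples."""
    true_mean = statistics.mean(population)
    errors = []
    for _ in range(repetitions):
        sample = rng.sample(population, sample_size)  # sampling without replacement
        errors.append(abs(statistics.mean(sample) - true_mean))
    return statistics.mean(errors)


rng = random.Random(42)  # fixed seed so the run is repeatable
population = [rng.gauss(50, 10) for _ in range(10_000)]  # artificial population

errors_by_size = {
    size: mean_absolute_sampling_error(population, size, repetitions=200, rng=rng)
    for size in (10, 100, 1000)
}
for size, error in errors_by_size.items():
    print(size, round(error, 3))
```

The average error falls as the sample size rises but does not reach zero. The simulation says nothing about non-random samples, where a larger sample does not guarantee a smaller error.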
The many sources of non-sampling errors include the following: • Researcher error – unclear definitions; reliability and validity issues; data analysis problems, for example, missing data. • Interviewer error – general approach; personal interview techniques; recording responses. • Respondent error – inability to answer; unwilling; cheating; not available; low response rate. Ethics in data collection in research Research ethics involves the application of fundamental ethical principles to a variety of topics involving scientific research.
Ethics are norms for conduct that distinguish between acceptable and unacceptable behavior. Ethical norms also serve the aims or goals of research and apply to people who conduct scientific research or other scholarly or creative activities, and there is a specialized discipline, research ethics, which studies these norms.

Codes and Policies for Research Ethics

Given the importance of ethics for the conduct of research, it should come as no surprise that many different professional associations, government agencies, and universities have adopted specific codes, rules, and policies relating to research ethics.
Objectivity
Strive to avoid bias in experimental design, data analysis, data interpretation, peer review, personnel decisions, grant writing, expert testimony, and other aspects of research where objectivity is expected or required. Avoid or minimize bias or self-deception. Disclose personal or financial interests that may affect research.

Integrity
Keep your promises and agreements; act with sincerity; strive for consistency of thought and action.

Carefulness
Avoid careless errors and negligence; carefully and critically examine your own work and the work of your peers. Keep good records of research activities, such as data collection, research design, and correspondence with agencies or journals.

Openness
Share data, results, ideas, tools, resources. Be open to criticism and new ideas.

Respect for Intellectual Property
Honor patents, copyrights, and other forms of intellectual property. Do not use unpublished data, methods, or results without permission. Give credit where credit is due. Give proper acknowledgment or credit for all contributions to research. Never plagiarize.

Confidentiality
Protect confidential communications, such as papers or grants submitted for publication, personnel records, trade or military secrets, and patient records.

Responsible Publication
Publish in order to advance research and scholarship, not to advance just your own career. Avoid wasteful and duplicative publication.

Responsible Mentoring
Help to educate, mentor, and advise students. Promote their welfare and allow them to make their own decisions.

Respect for colleagues
Respect your colleagues and treat them fairly.

Social Responsibility
Strive to promote social good and prevent or mitigate social harms through research, public education, and advocacy.

Non-Discrimination
Avoid discrimination against colleagues or students on the basis of sex, race, ethnicity, or other factors that are not related to their scientific competence and integrity.

Competence
Maintain and improve your own professional competence and expertise through lifelong education and learning; take steps to promote competence in science as a whole.

Legality
Know and obey relevant laws and institutional and governmental policies.

Animal Care
Show proper respect and care for animals when using them in research. Do not conduct unnecessary or poorly designed animal experiments.

Human Subjects Protection
When conducting research on human subjects minimize harms and risks and maximize benefits; respect human dignity, privacy, and autonomy; take special precautions with vulnerable populations; and strive to distribute the benefits and burdens of research fairly.

There are many other activities that the government does not define as “misconduct” but which are still regarded by most researchers as unethical. These are sometimes called “other deviations” from acceptable research practices.
Some of these might include:
• Publishing the same paper in two different journals without telling the editors
• Submitting the same paper to different journals without telling the editors
• Not informing a collaborator of your intent to file a patent in order to make sure that you are the sole inventor
• Including a colleague as an author on a paper in return for a favor even though the colleague did not make a serious contribution to the paper
• Discussing with your colleagues data from a paper that you are reviewing for a journal
• Trimming outliers from a data set without discussing your reasons in the paper
• Using an inappropriate statistical technique in order to enhance the significance of your research

There are several reasons why it is important to adhere to ethical norms in research. First, some of these norms promote the aims of research, such as knowledge, truth, and avoidance of error. For example, prohibitions against fabricating, falsifying, or misrepresenting research data promote the truth and avoid error.
Second, since research often involves a great deal of cooperation and coordination among many different people in different disciplines and institutions, many of these ethical standards promote the values that are essential to collaborative work, such as trust, accountability, mutual respect, and fairness. For example, many ethical norms in research, such as guidelines for authorship, copyright and patenting policies, data sharing policies, and confidentiality rules in peer review, are designed to protect intellectual property interests while encouraging collaboration. Most researchers want to receive credit for their contributions and do not want to have their ideas stolen or disclosed prematurely. Third, many of the ethical norms help to ensure that researchers can be held accountable to the public.
For instance, federal policies on research misconduct, on conflicts of interest, on human subjects protections, and on animal care and use are necessary in order to make sure that researchers who are funded by public money can be held accountable to the public. Fourth, ethical norms in research also help to build public support for research. People are more likely to fund a research project if they can trust the quality and integrity of the research. Finally, many of the norms of research promote a variety of other important moral and social values, such as social responsibility, human rights and animal welfare, compliance with the law, and health and safety.
Ethical lapses in research can significantly harm human and animal subjects, students, and the public. For example, a researcher who fabricates data in a clinical trial may harm or even kill patients, and a researcher who fails to abide by regulations and guidelines relating to radiation or biological safety may jeopardize his or her own health and safety or the health and safety of staff and students.

References
1) Green, P. E., Tull, D. S. and Albaum, G. (1993). Research Methods for Marketing Decisions, 5th edition, Prentice Hall, p. 136.
2) Shamoo, A. and Resnik, D. (2003). Responsible Conduct of Research. New York: Oxford University Press.
3) Panneerselvam, R. Research Methodology, Prentice Hall of India.