1st Assignment for PhD Scholar Research Methodology and Computer Application
…… University, …….
1. Explain (Answer in 50 words only)
A. Primary Data
Primary data are those that you have collected yourself, whereas secondary data originate elsewhere. Generally, you will be expected to collect primary data when using quantitative methods, while secondary data are more acceptable when you are using a qualitative method. This is because some common forms of qualitative research, such as the study of television or newspaper discourses, involve only secondary data. If you wanted to understand how Romany people are represented on television, you would not make your own television programmes; you would use those that already exist, and they would form your secondary data.
B. Secondary Data
Secondary data is data collected by someone other than the user. Common sources of secondary data for social science include censuses, surveys, organizational records and data collected through qualitative methodologies or qualitative research. Primary data, by contrast, are collected by the investigator conducting the research.
Secondary data analysis saves time that would otherwise be spent collecting data and, particularly in the case of quantitative data, provides larger and higher-quality databases that would be unfeasible for any individual researcher to collect on their own. In addition, analysts of social and economic change consider secondary data essential, since it is impossible to conduct a new survey that can adequately capture past change and/or developments.
C. Sampling Errors and Treatment
Sampling error is the deviation of the selected sample from the true characteristics, traits, behaviors, qualities or figures of the entire population. In statistics, sampling error or estimation error is the amount of inaccuracy in estimating some value that is caused by observing only a portion of a population (i.e. a sample) rather than the whole population. This amount of inaccuracy is commonly referred to as an error.
Sampling error can be measured and quoted in many different ways, but in practice the reported error itself is almost always an estimate of the real error rather than an absolute measure of the error (which would usually require analyzing the entire population).
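The distinction between the real error and its estimate can be illustrated with a minimal Python sketch. The population (the integers 1 to 1000), the sample size of 50 and the function names are all assumptions made for illustration. The real sampling error needs the whole population, while the standard error is estimated from the sample alone.

```python
import math
import random
import statistics


def sampling_error(sample, population):
    """Difference between the sample mean and the true population mean."""
    return statistics.mean(sample) - statistics.mean(population)


def estimated_standard_error(sample):
    """Estimate of the typical sampling error of the mean, from the sample alone."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


# Hypothetical population: the integers 1..1000 (true mean 500.5).
population = list(range(1, 1001))

rng = random.Random(42)              # fixed seed so the run is repeatable
sample = rng.sample(population, 50)  # simple random sample without replacement

print("actual sampling error:", sampling_error(sample, population))
print("estimated standard error:", estimated_standard_error(sample))
```

If the "sample" is the entire population, the sampling error is zero, which is why a census has no sampling error.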
D. Type I error
A type I error, also known as an error of the first kind, is the wrong decision that is made when a test rejects a true null hypothesis (H0). A type I error may be compared with a so-called false positive in other test situations. It can be viewed as the error of excessive credulity. In terms of folk tales, an investigator may be "crying wolf" (raising a false alarm) without a wolf in sight (H0: no wolf).
The rate of the type I error is called the size of the test and is denoted by the Greek letter α (alpha). It usually equals the significance level of the test. In the case of a simple null hypothesis, α is the probability of a type I error. If the null hypothesis is composite, α is the maximum (supremum) of the possible probabilities of a type I error.
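The meaning of α as a false-alarm rate can be checked by simulation. The sketch below is illustrative only: it assumes a two-sided z-test with a known standard deviation of 1, and the function name, sample size and number of trials are hypothetical choices. Every sample is drawn from a population in which H0 is true, so each rejection is a type I error.

```python
import math
import random
from statistics import NormalDist, mean


def type_i_error_rate(alpha=0.05, n=25, trials=4000, seed=1):
    """Fraction of two-sided z-tests that reject H0 when H0 (mean = 0) is true."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where H0 really holds (mean 0, sd 1).
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = mean(sample) * math.sqrt(n)              # sd is assumed known and equal to 1
        if abs(z) > critical:
            rejections += 1                          # a false alarm
    return rejections / trials


rate = type_i_error_rate()
print(f"observed type I error rate: {rate:.3f}")
```

The observed rate should lie close to the chosen α, differing only through simulation noise.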
E. Type II error
A type II error, also known as an error of the second kind, is the wrong decision that is made when a test fails to reject a false null hypothesis. A type II error may be compared with a so-called false negative in other test situations. Type II error can be viewed as the error of excessive skepticism. In terms of folk tales, an investigator may fail to see the wolf ("failing to raise an alarm"; see Aesop's story of The Boy Who Cried Wolf). Again, H0: no wolf.
The rate of the type II error is denoted by the Greek letter β (beta) and related to the power of a test (which equals 1 − β).
What we actually call type I or type II error depends directly on the null hypothesis. Negation of the null hypothesis causes type I and type II errors to switch roles.
The goal of the test is to determine if the null hypothesis can be rejected. A statistical test can either reject (prove false) or fail to reject (fail to prove false) a null hypothesis, but never prove it true (i.e., failing to reject a null hypothesis does not prove it true).
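The relation between β and power can be made concrete with a small calculation. This is a sketch under stated assumptions: a one-sided z-test of H0: mean = 0 with known standard deviation. The effect size, sample size and function name are hypothetical.

```python
import math
from statistics import NormalDist


def type_ii_error_rate(effect, sigma, n, alpha=0.05):
    """beta for a one-sided z-test of H0: mean = 0 against H1: mean = effect > 0."""
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha)   # rejection threshold for z
    shift = effect * math.sqrt(n) / sigma      # mean of z when H1 is true
    return std_normal.cdf(critical - shift)    # P(fail to reject | H1 true)


beta = type_ii_error_rate(effect=0.5, sigma=1.0, n=25)
power = 1 - beta
print(f"beta = {beta:.3f}, power = {power:.3f}")
```

Increasing the sample size or the true effect lowers β. When the true effect is zero, the "power" reduces to α, because every rejection is then a type I error.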
2. Explain the meaning of:
A. Statistical Analysis
This term refers to a wide range of techniques used to describe, explore, understand, test and predict, based on sample datasets collected from populations using some sampling strategy. It is a collection of methods used to process large amounts of data and report overall trends. Statistical analysis is particularly useful when dealing with noisy data, and it provides ways to report objectively on how unusual an event is based on historical data.
B. Probability Theories
Probability theory is that part of mathematics that aims to provide insight into phenomena that depend on chance or on uncertainty. The most prevalent use of the theory comes through the frequentist interpretation of probability in terms of the outcomes of repeated experiments, but probability is also used to provide a measure of subjective beliefs, especially as judged by one's willingness to place bets.
C. Hypothesis Tests
Setting up and testing hypotheses is an essential part of statistical inference. In order to formulate such a test, usually some theory has been put forward, either because it is believed to be true or because it is to be used as a basis for argument, but has not been proved; for example, claiming that a new drug is better than the current drug for treatment of the same symptoms.
In each problem considered, the question of interest is simplified into two competing claims / hypotheses between which we have a choice: the null hypothesis, denoted H0, against the alternative hypothesis, denoted H1. These two competing claims / hypotheses are not, however, treated on an equal basis: special consideration is given to the null hypothesis.
The hypotheses are often statements about population parameters like expected value and variance; for example, H0 might be that the expected value of the height of ten-year-old boys in the Scottish population is not different from that of ten-year-old girls. A hypothesis might also be a statement about the distributional form of a characteristic of interest, for example that the height of ten-year-old boys is normally distributed within the Scottish population.
D. Sample Test
In statistics and survey methodology, sampling is concerned with the selection of a subset of individuals from within a population to estimate characteristics of the whole population. Researchers rarely survey the entire population because the cost of a census is too high. The three main advantages of sampling are that the cost is lower, data collection is faster, and, since the data set is smaller, it is possible to ensure homogeneity and to improve the accuracy and quality of the data. Each observation measures one or more properties (such as weight, location, color) of observable bodies distinguished as independent objects or individuals. In survey sampling, weights can be applied to the data to adjust for the sample design, particularly stratified sampling (blocking). Results from probability theory and statistical theory are employed to guide practice. In business and medical research, sampling is widely used for gathering information about a population.
E. Formula of:
a) Chi-Square Test
The Chi Square (χ²) test is one of the most important and most used members of the nonparametric family of statistical tests. Chi Square is employed to test the difference between an actual sample and another hypothetical or previously established distribution, such as that which may be expected due to chance or probability. Chi Square can also be used to test differences between two or more actual samples.
Basic Computational Equation
$$\chi^2 = \sum \frac{(F_o - F_e)^2}{F_e}$$
Example:
| | A | U | D |
| Observed responses (Fo) | 8 | 8 | 14 |
| Expected responses (Fe) | (10) | (10) | (10) |
| Fo - Fe | -2 | -2 | 4 |
| (Fo - Fe)² | 4 | 4 | 16 |
| (Fo - Fe)² / Fe | .4 | .4 | 1.6 |
| χ² = Σ (Fo - Fe)² / Fe | | | 2.4 |
Degrees of freedom = (number of levels - 1) = 2
χ².05 = 5.991; 2.4 < 5.991
Therefore, we fail to reject the null hypothesis.
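The arithmetic of the worked example can be reproduced with a few lines of Python. This is a minimal sketch. The function name is an assumption, and the critical value is simply the table value quoted above rather than something computed here.

```python
def chi_square(observed, expected):
    """Sum of (Fo - Fe)^2 / Fe over all response categories."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


observed = [8, 8, 14]      # Fo for categories A, U, D
expected = [10, 10, 10]    # Fe under the null hypothesis
statistic = chi_square(observed, expected)
critical_value = 5.991     # table value for 2 degrees of freedom at the .05 level

print(round(statistic, 3))            # 2.4
print(statistic >= critical_value)    # False: fail to reject the null hypothesis
```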
When there is only one degree of freedom, an adjustment known as the Yates correction for continuity must be employed. To use this correction, a value of 0.5 is subtracted from the absolute value (irrespective of algebraic sign) of the numerator contribution of each cell to the above basic computational formula. The basic chi square computational formula then becomes:
$$\chi^2 = \sum \frac{(|F_o - F_e| - 0.5)^2}{F_e}$$
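The correction described above can be sketched in Python as follows. The two-category data are hypothetical and chosen only to give one degree of freedom, and the function name is an assumption.

```python
def chi_square_yates(observed, expected):
    """Chi square with 0.5 subtracted from each |Fo - Fe| before squaring."""
    return sum((abs(fo - fe) - 0.5) ** 2 / fe for fo, fe in zip(observed, expected))


# Hypothetical data with two response categories (one degree of freedom).
observed = [12, 8]
expected = [10, 10]

corrected = chi_square_yates(observed, expected)
print(round(corrected, 3))    # 0.45
```

Without the correction the same data give 0.8, so the correction makes the test more conservative.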
b) t Test
"t" is the difference between two sample means measured in terms of the standard error of those means; in other words, "t" is a comparison between two group means which takes into account the differences in group variation and group size of the two groups. The statistical hypothesis for the "t" test is stated as the null hypothesis concerning differences, for example: there is no significant difference in achievement between group 1 and group 2 on the welding test.
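Separate Variance Formula
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
Use the separate variance formula if the two variances cannot be assumed to be equal.
Pooled Variance Formula
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
Use the pooled variance formula if the two variances are homogeneous.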
Correlated Data Formula
If the samples are related (two measures from the same subject or matched pairs), the correlated data formula is used.
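One common way to write a t statistic for related samples uses the difference D between each pair of scores. It is offered here as a sketch and not necessarily the specific form intended above. The symbols D, s_D and n are notation assumed for this illustration.

```latex
t = \frac{\bar{D}}{s_D / \sqrt{n}}, \qquad df = n - 1
```

Here \bar{D} is the mean of the paired differences, s_D is their standard deviation (computed with n - 1 in the denominator), and n is the number of pairs.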
In choosing the correct formula, it is fairly easy to determine if the sample sizes are equal: the number of subjects is either the same or it is not.
However, to determine if the variances are homogeneous, use the formula $F = s^2_{\text{largest}} / s^2_{\text{smallest}}$. We compare the calculated F value to the F table value at the .05 or .01 level of significance with n1 - 1 and n2 - 1 degrees of freedom.
If the calculated value >= table value, then the variances are not equal; if the calculated value < table value, then the variances are equal.
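Example - Calculate the "t" value to test for differences between the achievement of the two samples.
| Sample 1 | | | Sample 2 | | |
| x1 | x1 - x̄1 | (x1 - x̄1)² | x2 | x2 - x̄2 | (x2 - x̄2)² |
| 1 | -2 | 4 | 1 | -4 | 16 |
| 2 | -1 | 1 | 3 | -2 | 4 |
| 3 | 0 | 0 | 5 | 0 | 0 |
| 4 | 1 | 1 | 7 | 2 | 4 |
| 5 | 2 | 4 | 9 | 4 | 16 |
| Σ = 15 | 0 | 10 | Σ = 25 | 0 | 40 |
$$\bar{x}_1 = \frac{15}{5} = 3 \qquad \bar{x}_2 = \frac{25}{5} = 5$$
$$s_1^2 = \frac{\sum (x_1 - \bar{x}_1)^2}{n_1 - 1} = \frac{10}{4} = 2.5 \qquad s_2^2 = \frac{\sum (x_2 - \bar{x}_2)^2}{n_2 - 1} = \frac{40}{4} = 10$$
n - 1 is used in the denominator since n < 30.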
Test for equal sample sizes and homogeneity of variances
n1 = n2 = 5
$F = s^2_{\text{largest}} / s^2_{\text{smallest}} = 10 / 2.5 = 4$ with 4 and 4 degrees of freedom
F.05 with 4 and 4 degrees of freedom = 6.39
4 < 6.39, so assume $s_1^2 = s_2^2$
Since sample sizes and variances are equal, either the separate variance formula or the pooled variance formula may be used.
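Separate Variance Formula
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{3 - 5}{\sqrt{\frac{2.5}{5} + \frac{10}{5}}} = \frac{-2}{\sqrt{2.5}} = -1.265$$
with 8 degrees of freedom
Pooled Variance Formula
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{3 - 5}{\sqrt{\frac{4(2.5) + 4(10)}{8}\left(\frac{1}{5} + \frac{1}{5}\right)}} = \frac{-2}{\sqrt{2.5}} = -1.265$$
with 8 degrees of freedom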
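The example can be checked numerically with the Python standard library. This is a minimal sketch and the variable names are assumptions. The two samples are the x1 and x2 values from the example, and `statistics.variance` uses n - 1 in the denominator, matching the calculation above.

```python
import math
from statistics import mean, variance

sample_1 = [1, 2, 3, 4, 5]
sample_2 = [1, 3, 5, 7, 9]
n1, n2 = len(sample_1), len(sample_2)
s1_sq, s2_sq = variance(sample_1), variance(sample_2)   # n - 1 in the denominator

# Homogeneity of variance check: larger variance over smaller variance.
f_ratio = max(s1_sq, s2_sq) / min(s1_sq, s2_sq)

mean_diff = mean(sample_1) - mean(sample_2)

# Separate variance formula.
t_separate = mean_diff / math.sqrt(s1_sq / n1 + s2_sq / n2)

# Pooled variance formula.
pooled_var = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
t_pooled = mean_diff / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

print(f_ratio)                 # 4.0
print(round(t_separate, 3))    # -1.265
print(round(t_pooled, 3))      # -1.265
```

With equal sample sizes the two formulas give the same t value, which is why either may be used in this example.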
As shown in the above example, the degrees of freedom are calculated differently depending upon whether the n's and s's are equal or not. We must check the degrees of freedom corresponding to the formula we use.
To test the hypothesis, we compare the absolute value of the calculated value to the table value for the significance level we have chosen. If the calculated value >= table value, we reject the null hypothesis and conclude the difference is greater than that expected by chance. If the calculated value < table value, we fail to reject the null hypothesis and conclude this amount of difference could have been the result of chance.
In our example, our calculated value was -1.265 with 8 df and the table value for the .01 level with 8 df was ±3.355. Since |-1.265| < |-3.355|, we fail to reject the null hypothesis and conclude that the mean difference in achievement between the two samples was no greater than would be expected by chance.
3. Explain (Answer in 50 words only)
A. Review of literature and its sources
A literature review is a body of text that aims to review the critical points of current knowledge including substantive findings as well as theoretical and methodological contributions to a particular topic. Literature reviews are secondary sources, and as such, do not report any new or original experimental work.
Most often associated with academic-oriented literature, such as a thesis, a literature review usually precedes a research proposal and results section. Its ultimate goal is to bring the reader up to date with current literature on a topic and forms the basis for another goal, such as future research that may be needed in the area.
A well-structured literature review is characterized by a logical flow of ideas; current and relevant references with consistent, appropriate referencing style; proper use of terminology; and an unbiased and comprehensive view of the previous research on the topic.
B. Data
Data is a collection of facts, such as values or measurements. It can be numbers, words, measurements, observations or even just descriptions of things.
Data collection means gathering information to address the critical evaluation questions that you identified earlier in the evaluation process. There are many methods available to gather information, and a wide variety of information sources. The most important issue related to data collection is selecting the most appropriate information or evidence to answer your questions. To plan data collection, you must think about the questions to be answered and the information sources available. You must also begin to think ahead about how the information could be organized, analyzed, interpreted and then reported to various audiences.
C. Information
Information in its most restricted technical sense is an ordered sequence of symbols that can be interpreted as a message. Information can be recorded as signs, or transmitted as signals. Information is any kind of event that affects the state of a dynamic system.
Conceptually, information is the message (utterance or expression) being conveyed. This concept has numerous other meanings in different contexts. Moreover, the concept of information is closely related to notions of constraint, communication, control, data, form, instruction, knowledge, meaning, mental stimulus, pattern, perception, representation, and especially entropy.
D. Data collection
Data collection is a term used to describe a process of preparing and collecting data, for example, as part of a process improvement or similar project. The purpose of data collection is to obtain information to keep on record, to make decisions about important issues, and to pass information on to others. Primarily, data are collected to provide information regarding a specific topic. Data collection usually takes place early on in an improvement project, and is often formalized through a data collection plan.
E. Sources of data and information
(i) By observation: This method implies the collection of information by way of the investigator's own observation, without interviewing the respondents. The information obtained relates to what is currently happening and is not complicated by either the past behaviour or the future intentions or attitudes of respondents. This method is expensive, and the information it provides is also very limited. As such, this method is not suitable in inquiries where large samples are concerned.
(ii) Through personal interview: The investigator follows a rigid procedure and seeks answers to a set of pre-conceived questions through personal interviews. This method of collecting data is usually carried out in a structured way, and the output depends to a large extent upon the ability of the interviewer.
(iii) Through telephone interviews: This method of collecting information involves contacting the respondents by telephone. It is not a very widely used method, but it plays an important role in industrial surveys in developed regions, particularly when the survey has to be accomplished in a very limited time.
(iv) By mailing of questionnaires: The researcher and the respondents do not come in contact with each other if this method of survey is adopted. Questionnaires are mailed to the respondents with a request to return them after completion. It is the most extensively used method in various economic and business surveys. Before applying this method, a pilot study for testing the questionnaire is usually conducted, which reveals the weaknesses, if any, of the questionnaire. The questionnaire to be used must be prepared very carefully so that it may prove to be effective in collecting the relevant information.
(v) Through schedules: Under this method, enumerators are appointed and given training. They are provided with schedules containing relevant questions. These enumerators go to respondents with these schedules. Data are collected by the enumerators, who fill up the schedules on the basis of replies given by respondents. Much depends upon the capability of the enumerators so far as this method is concerned. Some occasional field checks on the work of the enumerators may ensure sincere work.
F. Data Treatment
Two important, though often neglected, parts of an analysis are error analysis and correct results reporting. Results should always be reported along with some estimation of the errors involved. The best way to do this is to report the most likely value along with a confidence interval. The confidence interval gives the range of values thought to contain the "true" value. The statistical treatment of data involves basing the error estimation on firm theoretical principles. This laboratory exercise on treatment of data should help you understand and apply these principles.
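Reporting a most likely value together with a confidence interval can be sketched as follows. This is an illustration under stated assumptions: the measurements are hypothetical, the function name is invented, and the interval uses the normal approximation (z value). For small samples a t-based interval would usually be wider and more appropriate.

```python
import math
from statistics import NormalDist, mean, stdev


def confidence_interval(values, level=0.95):
    """Return (most likely value, lower bound, upper bound) for the mean."""
    centre = mean(values)
    standard_error = stdev(values) / math.sqrt(len(values))
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    return centre, centre - z * standard_error, centre + z * standard_error


# Hypothetical repeated measurements of the same quantity.
measurements = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0]
value, low, high = confidence_interval(measurements)
print(f"{value:.2f} (95% CI {low:.2f} to {high:.2f})")
```

Raising the confidence level widens the interval, since a wider range is needed to be more confident of containing the "true" value.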
G. Sampling and sampling errors
In statistics and survey methodology, sampling is concerned with the selection of a subset of individuals from within a population to estimate characteristics of the whole population. Researchers rarely survey the entire population because the cost of a census is too high. The three main advantages of sampling are that the cost is lower, data collection is faster, and, since the data set is smaller, it is possible to ensure homogeneity and to improve the accuracy and quality of the data.
Sampling error is the deviation of the selected sample from the true characteristics, traits, behaviors, qualities or figures of the entire population. Sampling process error occurs because researchers draw different subjects from the same population, and those subjects still have individual differences. Keep in mind that when you take a sample, it is only a subset of the entire population; therefore, there may be a difference between the sample and the population.