Thursday, May 12, 2022

5-DAY TRAINING ON SPSS FOR BEGINNERS

2nd batch:
5-DAY TRAINING ON 'DATA SUMMARIZATION' USING SPSS 
(OPEN FOR BEGINNERS)

Data summarization of Categorical data 

Bar chart
A bar chart or bar graph is a chart or graph that presents categorical data with rectangular bars with heights or lengths proportional to the values that they represent. The bars can be plotted vertically or horizontally. A vertical bar chart is sometimes called a column chart.

A horizontal bar chart is a graph in the form of rectangular bars. It's a data visualization technique. The length of these bars is proportional to the values they represent. The bar chart title indicates which data is represented.

This type of chart provides a visual representation of categorical data, that is, data sorted into groups. Examples include months of the year, age ranges, shoe sizes, or animal species.

To create a horizontal bar chart, first choose an appropriate scale based on the data values. Then mark the data categories on the vertical axis and the values on the horizontal axis.

Next, draw the bar for each data category. Finally, choose an appropriate title for the chart and note the scale used.

Vertical bar graph 
A bar graph or bar chart in which the bars are plotted vertically along the y-axis is known as a Vertical Bar Graph. Bar graphs are preferably drawn for the discrete data types. A bar graph is mostly used for quantitative comparison between items or observations.

Based on the structure of the bars and the number of parameters, the bar graph is classified into the following six types.
  • Horizontal bar graph.
  • Vertical bar graph.
  • Double bar graph (Grouped bar graph)
  • Multiple bar graph (Grouped bar graph)
  • Stacked bar graph.
  • Bar line graph.
A stacked bar graph (or stacked bar chart) is a chart that uses bars to show comparisons between categories of data, but with ability to break down and compare parts of a whole. Each bar in the chart represents a whole, and segments in the bar represent different parts or categories of that whole.

A stacked bar graph can have one category axis and up to two numerical axes. The category axis describes the types of categories being compared, and the numerical axes represent the values of the data.


This type of visualisation depicts items stacked one on top (column) of the other or side-by-side (bar), differentiated by coloured bars or strips. A stacked graph is useful for looking at changes in, for example, expenditures added up over time, across several products or services.



Items are "stacked" in this type of graph, allowing the user to add up the underlying data points. Stacked graphs should be used when the sum of the values is as important as the individual items. Stacked graphs are commonly used on bars, to show multiple values for individual categories, or lines, to show multiple values over time. Because the segments are added together, stacked graphs should only be used with positive values.


Stacked bar graphs are often used in evaluation to show the full scale of survey responses, from Strongly Disagree to Strongly Agree, for each survey question. There are two options with the stacked graph – one that shows individual elements (raw data) and one that shows results as a percentage. For the latter, the stacked values should add to 100%.
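
The difference between the raw-count option and the percentage option can be sketched in a few lines of Python. This is only an illustration: the question names and response counts below are made up, and `to_percentages` is a hypothetical helper, not part of SPSS.

```python
# Hypothetical survey counts for two questions (illustrative data only).
responses = {
    "Q1": {"Strongly Disagree": 4, "Disagree": 6, "Neutral": 10,
           "Agree": 12, "Strongly Agree": 8},
    "Q2": {"Strongly Disagree": 1, "Disagree": 4, "Neutral": 5,
           "Agree": 8, "Strongly Agree": 2},
}


def to_percentages(counts):
    """Convert the raw counts of one bar into percentage segments."""
    total = sum(counts.values())
    return {label: 100 * n / total for label, n in counts.items()}


# One stacked bar per question; the segments of each bar add to 100%.
percent_stacks = {q: to_percentages(c) for q, c in responses.items()}

for question, segments in percent_stacks.items():
    print(question, segments, "total =", round(sum(segments.values()), 6))
```

With raw counts the two bars would have different heights (40 and 20 responses); after conversion both bars have the same height, so the composition of the answers is easier to compare.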


Stacked line graphs often show how quantities have changed over time, such as school racial composition, where each racial category would correspond to a strip in the graph.


While stacked graphs are helpful for conveying multiple levels of meaning simultaneously, they also have some drawbacks. Though it is fairly easy to interpret the values for the first bar or first strip in the graph, it can be difficult to judge the exact widths of any subsequent strips, or to compare the widths of two strips. If accuracy or comparisons are of primary importance, another type of chart may be more appropriate.

Histogram

A histogram is a chart that groups numeric data into bins, displaying the bins as segmented columns. They're used to depict the distribution of a dataset: how often values fall into ranges.
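
The binning idea can be illustrated with a short Python sketch. The ages, the bin width of 10 and the starting value of 20 are assumptions chosen for the example, and `histogram_counts` is a hypothetical helper.

```python
def histogram_counts(values, bin_width, start):
    """Count how many values fall into each bin of equal width.

    Returns a dict that maps the lower edge of each bin to its count.
    """
    counts = {}
    for v in values:
        index = int((v - start) // bin_width)   # which bin the value falls in
        lower = start + index * bin_width       # lower edge of that bin
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))


# Hypothetical ages of ten participants.
ages = [21, 23, 25, 31, 34, 35, 38, 42, 47, 55]
bins = histogram_counts(ages, bin_width=10, start=20)

# A crude text histogram: one '#' per case in the bin.
for lower, n in bins.items():
    print(f"{lower}-{lower + 9}: {'#' * n}")
```

Changing `bin_width` changes the shape of the histogram, which is why the choice of bins deserves some thought.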





Stacked Bar graph can be used to represent: Ranking, Nominal Comparisons, Part-to-whole, Deviation, or Distribution.

Here are some examples from BusinessQ software:

https://businessq-software.com/2017/02/21/stacked-bar-chart-definition-and-examples-businessq/



Confidence interval 

In frequentist statistics, a confidence interval is a range of estimates for an unknown parameter. A confidence interval is computed at a designated confidence level; the 95% confidence level is most common, but other levels, such as 90% or 99%, are sometimes used.
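
As a rough illustration, a confidence interval for a mean can be computed as below. This sketch assumes the normal approximation (mean plus or minus z times the standard error); for small samples a t-based interval, which is what statistical packages usually report, will be somewhat wider. The scores are made up and `mean_confidence_interval` is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, level=0.95):
    """Normal-approximation confidence interval for the population mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)      # about 1.96 for 95%
    centre = mean(sample)
    std_error = stdev(sample) / sqrt(len(sample))  # standard error of the mean
    return centre - z * std_error, centre + z * std_error


# Hypothetical test scores.
scores = [72, 75, 78, 80, 81, 83, 85, 88, 90, 93]

low95, high95 = mean_confidence_interval(scores, level=0.95)
low99, high99 = mean_confidence_interval(scores, level=0.99)
print(f"95% CI: {low95:.2f} to {high95:.2f}")
print(f"99% CI: {low99:.2f} to {high99:.2f}")
```

The 99% interval is wider than the 95% interval: a higher confidence level is paid for with a less precise range.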




Normal Probability Curve 

The Normal Probability Curve (N.P.C.) is symmetrical about the ordinate of the central point of the curve. This means that the size, shape and slope of the curve on one side of the centre are identical to those on the other. That is, the normal curve has bilateral symmetry.


What is the one-sample t-test?

The one-sample t-test is a statistical hypothesis test used to determine whether an unknown population mean is different from a specific value.

When can I use the test?

You can use the test for continuous data. Your data should be a random sample from a normal population.

What if my data isn’t nearly normally distributed?

If your sample sizes are very small, you might not be able to test for normality. You might need to rely on your understanding of the data. When you cannot safely assume normality, you can perform a nonparametric test that doesn’t assume normality.
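
The statistic behind the one-sample t-test can be sketched as follows. The formula used is the standard one, t = (sample mean - test value) / (s / sqrt(n)) with n - 1 degrees of freedom; the sample values and the test value of 5.0 are made up, and `one_sample_t` is a hypothetical helper. The p-value needs the t distribution, which is not in the Python standard library, so it is left to a statistics package such as SPSS.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, test_value):
    """Return the t statistic and degrees of freedom for a one-sample t-test."""
    n = len(sample)
    std_error = stdev(sample) / sqrt(n)        # standard error of the mean
    t = (mean(sample) - test_value) / std_error
    return t, n - 1


# Hypothetical measurements, tested against a hypothesised mean of 5.0.
measurements = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5]
t_stat, df = one_sample_t(measurements, test_value=5.0)
print(f"t = {t_stat:.3f}, df = {df}")
```

A t statistic far from zero (relative to the t distribution with `df` degrees of freedom) suggests that the population mean differs from the test value.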

Difference between Means

The mean difference (more correctly, 'difference between means') is a standard statistic that measures the absolute difference between the mean values of two groups in experimental research (the sign is dropped, so a negative difference is reported as positive). It estimates the amount by which the experimental intervention changes the outcome on average compared with the control.

For example, let's say the mean score on a depression test for a group of 100 middle-aged men is 35 and for 100 middle-aged women it is 25. If you took a large number of samples from both these groups and calculated the mean differences, the mean of all of the differences between all sample means would be 35 – 25 = 10.

Standardized differences 

In statistics, the strictly standardized mean difference (SSMD) is a measure of effect size. It is the mean divided by the standard deviation of a difference between two random values each from one of two groups.

The standardized mean difference is used as a summary statistic in meta-analysis when the studies all assess the same outcome but measure it in a variety of ways (for example, all studies measure depression but they use different psychometric scales). In this circumstance it is necessary to standardize the results of the studies to a uniform scale before they can be combined. The standardized mean difference expresses the size of the intervention effect in each study relative to the variability observed in that study. (Again, in reality the intervention effect is a difference in means and not a mean of differences.)
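
The quantities discussed in the last two sections can be sketched in Python. The notes do not give formulas, so the choices here are assumptions: the standardized mean difference is computed as Cohen's d with a pooled standard deviation (one common version), and the SSMD is computed for two independent groups, where the variance of the difference is the sum of the two group variances. The data and function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def mean_difference(a, b):
    """Absolute difference between the two group means."""
    return abs(mean(a) - mean(b))


def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def ssmd_independent(a, b):
    """SSMD assuming the two groups are independent."""
    return (mean(a) - mean(b)) / sqrt(variance(a) + variance(b))


# Hypothetical scores for an intervention group and a control group.
intervention = [2, 4, 6, 8, 10]
control = [1, 3, 5, 7, 9]

print("mean difference:", mean_difference(intervention, control))
print("Cohen's d:", round(cohens_d(intervention, control), 4))
print("SSMD:", round(ssmd_independent(intervention, control), 4))
```

The raw mean difference stays in the units of the original scale, while the two standardized versions express it relative to the spread of the data, which is what makes results from different scales comparable.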

Independent Samples t Test

The Independent Samples t Test compares the means of two independent groups in order to determine whether there is statistical evidence that the associated population means are significantly different. The Independent Samples t Test is a parametric test.

The Independent Samples t Test is commonly used to test the following:

Note: The Independent Samples t Test can only compare the means for two (and only two) groups. It cannot make comparisons among more than two groups. If you wish to compare the means across more than two groups, you will likely want to run an ANOVA.
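
For readers who want to see what is computed, here is a minimal Python sketch of the two usual forms of the test statistic: the pooled version, which assumes equal variances, and Welch's version, which does not. The formulas are the standard textbook ones; the data and function names are hypothetical, and the p-value is left to a statistics package because the t distribution is not in the Python standard library.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """t statistic and degrees of freedom assuming equal variances."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2


def welch_t(a, b):
    """t statistic and approximate degrees of freedom without equal variances."""
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb   # squared standard errors
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical scores for two independent groups.
group_a = [2, 4, 6, 8, 10]
group_b = [1, 3, 5, 7, 9]

print("pooled:", pooled_t(group_a, group_b))
print("Welch: ", welch_t(group_a, group_b))
```

When the two groups have the same size and the same variance, as in this made-up example, both versions give the same result; they diverge as the variances or group sizes become unequal.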

Some assumptions 

Note: When one or more of the assumptions for the Independent Samples Test are not met, you may want to run the nonparametric Mann-Whitney U Test instead.

Researchers often follow several rules of thumb:



SPSS GUIDE

TEST OF VARIANCE

LEVENE'S TEST OF VARIANCE

In statistics, Levene's test is an inferential statistic used to assess the equality of variances for a variable calculated for two or more groups.[1] Some common statistical procedures assume that variances of the populations from which different samples are drawn are equal. Levene's test assesses this assumption. It tests the null hypothesis that the population variances are equal (called homogeneity of variance or homoscedasticity). If the resulting p-value of Levene's test is less than some significance level (typically 0.05), the obtained differences in sample variances are unlikely to have occurred based on random sampling from a population with equal variances. Thus, the null hypothesis of equal variances is rejected and it is concluded that there is a difference between the variances in the population.

Some of the procedures typically assuming homoscedasticity, for which one can use Levene's tests, include analysis of variance and t-tests.

Levene's test is sometimes used before a comparison of means, informing the decision on whether to use a pooled t-test or Welch's t-test. However, it was shown that such a two-step procedure may markedly inflate the type 1 error obtained with the t-tests and thus should not be done in the first place.[2] Instead, the choice of pooled or Welch's test should be made a priori based on the study design.
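
As a rough sketch of what Levene's test computes, the statistic W can be written in Python as below. This uses the mean-centred form: each value is replaced by its absolute deviation from its own group mean, and a one-way ANOVA F ratio is then computed on those deviations. The data and the function name `levene_w` are hypothetical; turning W into a p-value requires the F distribution with (k - 1, N - k) degrees of freedom, which is left to a statistics package.

```python
from statistics import mean


def levene_w(*groups):
    """Levene's W statistic based on absolute deviations from group means."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)

    # Absolute deviation of every value from its own group mean.
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    z_group_means = [mean(zi) for zi in z]
    z_grand_mean = sum(sum(zi) for zi in z) / n_total

    # Between-group and within-group sums of squares of the deviations.
    between = sum(len(zi) * (zm - z_grand_mean) ** 2
                  for zi, zm in zip(z, z_group_means))
    within = sum((x - zm) ** 2
                 for zi, zm in zip(z, z_group_means) for x in zi)

    return (n_total - k) / (k - 1) * between / within


# Hypothetical groups: the first two have the same spread, the third is wider.
narrow_1 = [1, 2, 3, 4, 5]
narrow_2 = [11, 12, 13, 14, 15]
wide = [0, 5, 10, 15, 20]

print("same spread:     W =", levene_w(narrow_1, narrow_2))
print("different spread: W =", round(levene_w(narrow_1, wide), 4))
```

A W of zero means the groups have identical average deviations; the larger W is, the stronger the evidence against equal variances.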










====================================

*5-DAY TRAINING ON SPSS FOR BEGINNERS*
*Organized by*: Rabindrik Psychotherapy Research Institute Trust.
*Date and Time*: 17-21st May, 8-9 AM.
*Registration fees*: Rs.500.
*Qualifications*: Graduation and above.
*Eligibility*: Good writing skill in English.
*Program Schedule*: It is divided into two lecture sessions and assignments. 
Assignments include writing the lecture contents and presentation.
Day 1: Introduction-scope, importance and applications, windows
Day 2: Platforms of SPSS- Variable, data editor, syntax editor, output editor.
Day 3: Scales of Measurement. Data quality change-
Day 4: Case management.
Day 5: Data visualization.
Payment to: Axis bank, Dunlop bridge. Account number:920020072908427, IFSC code: UTIB0000236. In favour of: Rabindrik Psychotherapy Research Institute Trust.
*Contact to*: rpri.edu@gmail.com.
*Certificate*: Participation Certificate will be provided. 

Courses 

Day 1: Introduction-scope, importance and applications, windows.

1. Definition:
SPSS is a Windows based program that can be used to perform data entry and analysis and to create tables and graphs. SPSS is capable of handling large amounts of data and can perform all of the analyses covered in the text and much more. It was originally launched in 1968 by SPSS Inc., and was later acquired by IBM in 2009. 

2. Scope: 
SPSS is short for Statistical Package for the Social Sciences, and it's used by various kinds of researchers for complex statistical data analysis. The SPSS software package was created for the management and statistical analysis of social science data.
 SPSS is used by market researchers, health researchers, survey companies, government entities, education researchers, marketing organizations, data miners, and many more for processing and analyzing survey data.

3. Core functions of SPSS:

SPSS offers four programs that assist researchers with their complex data analysis needs.

Statistics Program

SPSS’s Statistics program provides a plethora of basic statistical functions, some of which include frequencies, cross-tabulation, and bivariate statistics.

Modeler Program

SPSS’s Modeler program enables researchers to build and validate predictive models using advanced statistical procedures.

Text Analytics for Surveys Program

SPSS’s Text Analytics for Surveys program helps survey administrators uncover powerful insights from responses to open-ended survey questions.

Visualization Designer

SPSS’s Visualization Designer program allows researchers to easily create a wide variety of visuals, such as density charts and radial boxplots, from their survey data.

In addition to the four programs mentioned above, SPSS also provides solutions for data management, which allow researchers to perform case selection, create derived data, and perform file reshaping. 

SPSS also offers data documentation, which allows researchers to store a metadata dictionary. This metadata dictionary acts as a centralized repository of information pertaining to the data, such as meaning, relationships to other data, origin, usage, and format.

There are a handful of statistical methods that can be leveraged in SPSS, including: 

  • Descriptive statistics, including methodologies such as frequencies, cross-tabulation, and descriptive ratio statistics.
  • Bivariate statistics, including methodologies such as analysis of variance (ANOVA), means, correlation, and nonparametric tests.
  • Numerical outcome prediction, such as linear regression.
  • Prediction for identifying groups, including methodologies such as cluster analysis and factor analysis.
Ref: https://www.alchemer.com/resources/blog/spss-survey-variables/


SPSS WINDOWS 

SPSS utilizes multiple types of windows, or screens, in its basic operations. Each window is associated with specific tasks and types of SPSS files. The windows include the Data Editor, Output Viewer, Syntax Editor, Pivot Table Editor, Chart Editor, and Text Output Editor.

DATA EDITOR

The Data Editor window is the default window and opens when SPSS starts. This window displays the content of any open data files and provides drop-down menus that allow you to modify and analyze data. The data are displayed in a spreadsheet format where columns represent variables and rows represent cases. The spreadsheet format includes two tabs at the bottom labeled Data View and Variable View. The Data View tab displays the open data set: variables appear in columns, and cases appear in rows. The Variable View tab displays information about variables in the open data (but not the data themselves), such as variable names, types, and labels, etc. 

OUTPUT VIEWER

When you perform any command in SPSS, the Output Viewer window opens automatically and displays a log of the actions taken and the associated output. Primarily, the Output Viewer is where the results of statistical analysis are shown, but any command invoked through the drop-down menus or syntax will be printed to the Output Viewer. This includes opening, closing, or saving a data file. If an Output Viewer window is not open when a command is run, a new Output Viewer window will automatically be created.


The Output Viewer window is divided into two sections, or frames. The left frame contains an outline of the content in the Output Viewer. This outline is especially useful when you have run many SPSS commands and need to locate a particular section of output easily. The right frame contains the actual output. Clicking on an item in the left frame will jump to that content in the right frame. Items that have been selected in the right frame are indicated by a red arrow and a box drawn around the content.

You can modify the contents in the Output Viewer by selecting items in the left or right frame and copying, pasting, or deleting them. To remove an item from the Output Viewer, click on its name in the left frame or click on the object itself in the right frame, then press the Delete key on your keyboard.

An Output Viewer window can be saved as a viewer file (*.spv) so that you can review it again without having to re-run the same commands in SPSS. To save an Output Viewer window, click File > Save As. Alternatively, you can export some or all of the contents in the Viewer window to a new document or image file by clicking File > Export. In general, you can export all content as a PDF (*.pdf), a PowerPoint file (*.ppt), an Excel file (*.xls or *.xlsx), a Word file (*.doc or *.docx), an HTML file (*.htm), or a text file (*.txt). Graphs can be saved as *.bmp, *.emf, *.eps, *.jpeg, *.png, or *.tif.

SYNTAX EDITOR

SPSS syntax is a programming language unique to SPSS that can be used as an alternative to the drop-down menus for data manipulation and statistical analyses. The Syntax Editor window is where users can write, debug,  and execute SPSS syntax. To open a new Syntax Editor window, click File > New > Syntax.

The right panel of the Syntax Editor window is where your syntax is entered. The left panel of the Syntax Editor window shows an outline of the commands in your syntax, and can be used to navigate within your code. You can jump to a specific part of your code by clicking on the command in the left panel. This feature is useful for showing the start and end points of a command, especially if the command is longer than one line.

Syntax can be saved as an *.sps file by clicking File > Save or File > Save As within the Syntax Editor window.

Syntax Commands: Advanced users can interact with SPSS by writing their own syntax. Syntax is a command-driven language that tells SPSS what actions to perform on the data. Using syntax commands (rather than drop-down menus) is preferable for several reasons:

Overall, syntax offers more flexibility, a clearer record, and greater ease in making changes and re-running commands. It does take some practice to learn to write the basic command language, but once you learn the language the benefits of working with data in this way will become very clear.

To use syntax, click File > New > Syntax. This opens a new Syntax Editor window where you can write and execute syntax commands.

ref: https://libguides.library.kent.edu/spss/environment

Day 2: Platforms of SPSS- Variable, data editor, syntax editor, output editor.

====================================
Day 3

Creating and Manipulating Variables in SPSS

One of the best ways analysts can add value is by finding new ways to examine data. There are many ways to change data that allow for further, more specific, analysis.

Here are a few of the common ones that I do regularly using the techniques we’ve been discussing:

  1. Create variables that summarize others (Average, Sum, etc.)
  2. Create an interval variable from a well-grouped categorical variable
  3. Create a binary (two-category) variable from a categorical variable
  4. Regroup categorical variables to increase the size of sub-groups
  5. Regroup several variables into one
  6. Mathematically transform the variable (Log, Square, Square root, etc.)
  7. Set specific values to missing so they do not get used in analysis
  8. Replace missing values with a valid value
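
A few of the transformations in the list above can be sketched in plain Python to show the logic. This is not SPSS syntax: the variable names, the data, and the missing-value code 99 are all made up for illustration.

```python
from math import log

MISSING_CODE = 99                      # hypothetical code meaning "no answer"
ITEMS = ["item1", "item2", "item3"]    # hypothetical questionnaire items

# Each dict is one case (one row in the Data View).
cases = [
    {"item1": 4, "item2": 5, "item3": 3, "region": "City", "income": 1000},
    {"item1": 2, "item2": 1, "item3": 99, "region": "Town", "income": 100},
]

for case in cases:
    # Item 7: set specific values to missing so they are not used in analysis.
    for name in ITEMS:
        if case[name] == MISSING_CODE:
            case[name] = None

    valid = [case[name] for name in ITEMS if case[name] is not None]

    # Item 1: create variables that summarize others (sum and average).
    case["item_sum"] = sum(valid)
    case["item_mean"] = sum(valid) / len(valid)

    # Item 3: create a two-category variable from a categorical variable.
    case["urban"] = 1 if case["region"] == "City" else 0

    # Item 6: mathematically transform a variable (natural log).
    case["log_income"] = log(case["income"])

for case in cases:
    print(case)
```

Note that the second case's mean is based on two valid items only, because the value coded 99 was set to missing before the summary was computed. The order of the steps matters.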
====================================
Day 3
Scales of Measurement 

Levels of Measurement in Statistics

To perform statistical analysis of data, it is important to first understand variables and what should be measured using these variables. There are different levels of measurement in statistics and data measured using them can be broadly classified into qualitative and quantitative data.

First, let’s understand what a variable is. A quantity whose value changes across the population and can be measured is called a variable. For instance, consider a sample of employed individuals. The variables for this sample can be industry, location, gender, age, skills, job type, etc. The values of the variables will differ from employee to employee.

What are Nominal, Ordinal, Interval and Ratio Scales?

Nominal, Ordinal, Interval, and Ratio are the four fundamental levels of measurement scales used to capture data in surveys and questionnaires, each typically in the form of a multiple choice question.

Each scale is an incremental level of measurement, meaning that each scale fulfills the function of the previous scale, and all survey question scales, such as Likert, Semantic Differential, Dichotomous, etc., are derived from these four fundamental levels of variable measurement. Before we discuss all four levels of measurement scales in detail, with examples, let’s have a brief look at what these scales represent.

Nominal scale is a naming scale, where variables are simply “named” or labeled, with no specific order. Ordinal scale has all its variables in a specific order, beyond just naming them. Interval scale offers labels, order, and a specific interval between each of its variable options. Ratio scale bears all the characteristics of an interval scale and, in addition, has a true zero point, so a value of “zero” means the absence of the quantity being measured.

Nominal Scale: 1st Level of Measurement

Nominal Scale, also called the categorical variable scale, is defined as a scale used for labeling variables into distinct classifications and doesn’t involve a quantitative value or order. This scale is the simplest of the four variable measurement scales. Calculations done on these variables will be futile as there is no numerical value of the options.

There are cases where this scale is used for the purpose of classification – the numbers associated with variables of this scale are only tags for categorization or division. Calculations done on these numbers will be futile as they have no quantitative significance.

For a question such as:

Where do you live?

  • 1- Suburbs
  • 2- City
  • 3- Town

Nominal scale is often used in research surveys and questionnaires where only variable labels hold significance.

For instance, a customer survey asking “Which brand of smartphones do you prefer?” with the options “Apple” - 1, “Samsung” - 2, “OnePlus” - 3.

  • In this survey question, only the names of the brands are significant for the researcher conducting consumer research. There is no need for any specific order for these brands. However, while capturing nominal data, researchers conduct analysis based on the associated labels.
  • In the above example, when a survey respondent selects Apple as their preferred brand, the data entered and associated will be “1”. This helps in quantifying and answering the final question: how many respondents selected Apple, how many selected Samsung, how many went for OnePlus, and which one is the highest.
  • This is the fundamental of quantitative research, and nominal scale is the most fundamental research scale.

Nominal Scale Data and Analysis

There are two primary ways in which nominal scale data can be collected:

  1. By asking an open-ended question, the answers to which can be coded to a number or label decided by the researcher.
  2. By including a multiple choice question in which the answers will be labeled.

In both cases, the analysis of gathered data will happen using percentages or the mode, i.e., the most common answer received for the question. A single question can have more than one mode, as two equally common favorites can exist in a target population.
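
The percentage and mode analysis described above can be sketched in Python using the smartphone example. The coding (1 = Apple, 2 = Samsung, 3 = OnePlus) comes from the example in the text; the eight answers are made up.

```python
from collections import Counter
from statistics import multimode

# Numeric codes are only tags for the categories.
codes = {1: "Apple", 2: "Samsung", 3: "OnePlus"}

# Hypothetical coded answers from eight respondents.
answers = [1, 2, 1, 3, 2, 1, 2, 3]

counts = Counter(answers)
percentages = {codes[c]: 100 * n / len(answers) for c, n in counts.items()}

# multimode returns every most-common value, so ties are not hidden.
modes = [codes[c] for c in multimode(answers)]

print("percentages:", percentages)
print("mode(s):", modes)
```

Here Apple and Samsung are tied, so the question has two modes. Averaging the codes (which would give about 1.9) would be meaningless, because the numbers carry no quantitative value.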

Nominal Scale SPSS

In SPSS, you can specify the level of measurement as scale (numeric data on an interval or ratio scale), ordinal, or nominal. Nominal and ordinal data can be either string alphanumeric or numeric.

Upon importing the data for any variable into the SPSS input file, it takes it as a scale variable by default since the data essentially contains numeric values. It is important to change it to either nominal or ordinal or keep it as scale depending on the variable the data represents.

Ordinal Scale: 2nd Level of Measurement

Ordinal Scale is defined as a variable measurement scale used to simply depict the order of variables and not the difference between each of the variables. These scales are generally used to depict non-mathematical ideas such as frequency, satisfaction, happiness, a degree of pain, etc. It is quite straightforward to remember the implementation of this scale as ‘Ordinal’ sounds similar to ‘Order’, which is exactly the purpose of this scale.

Ordinal Scale maintains descriptional qualities along with an intrinsic order but is void of an origin of scale and thus, the distance between variables can’t be calculated. Descriptional qualities indicate tagging properties similar to the nominal scale, in addition to which, the ordinal scale also has a relative position of variables. Origin of this scale is absent due to which there is no fixed start or “true zero”.

Ordinal Data and Analysis  

Ordinal scale data can be presented in tabular or graphical formats for a researcher to conduct a convenient analysis of collected data. Methods such as the Mann-Whitney U test and the Kruskal–Wallis H test can also be used to analyze ordinal data. These methods are generally implemented to compare two or more ordinal groups.

With the Mann-Whitney U test, researchers can assess whether values in one group tend to be larger or smaller than values in another, independent group. With the Kruskal–Wallis H test, researchers can analyze whether two or more ordinal groups have the same median or not.
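
To make the Mann-Whitney U test less abstract, here is a minimal Python sketch of how the U statistic is obtained from ranks. It uses the standard rank-sum formula, with tied values sharing the average of their ranks. The ratings and the function name `mann_whitney_u` are hypothetical, and the significance test itself is left to a statistics package.

```python
def mann_whitney_u(a, b):
    """Return (U for group a, U for group b) from the rank sums."""
    combined = sorted(list(a) + list(b))

    # Assign ranks 1..N; tied values share the average of their ranks.
    ranks = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        ranks[combined[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        i = j

    rank_sum_a = sum(ranks[x] for x in a)
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a                # the two U values sum to n1 * n2
    return u_a, u_b


# Hypothetical satisfaction ratings (1 = very low, 5 = very high).
group_1 = [1, 2, 2, 3]
group_2 = [3, 4, 5, 5]

u1, u2 = mann_whitney_u(group_1, group_2)
print("U for group 1:", u1, "| U for group 2:", u2)
```

A U value close to zero for one group means almost all of its values rank below those of the other group; values near half of n1 * n2 for both groups indicate no clear tendency either way.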
















*5-DAY TRAINING ON SPSS FOR BEGINNERS*
*Organized by*: Rabindrik Psychotherapy Research Institute Trust.
*Date and Time*: 17-21st May, 8-9 AM.
*Registration fees*: Rs.500.
*Qualifications*: Graduation and above.
*Eligibility*: Good writing skill in English.
*Program Schedule*: It is divided into two lecture sessions and assignments. 
Assignments include writing the lecture contents and presentation.
Day 1: Introduction-scope, importance and applications, windows
Day 2: Platforms of SPSS- Variable, data editor, syntax editor, output editor.
Day 3: Scales of Measurement. Data quality change-
Day 4: Case management.
Day 5: Data visualization.
Payment to: Axis bank, Dunlop bridge. Account number:920020072908427, IFSC code: UTIB0000236. In favour of: Rabindrik Psychotherapy Research Institute Trust.
*Contact to*: rpri.edu@gmail.com.
*Certificate*: Participation Certificate will be provided. 

Courses 

Day 1: Introduction-scope, importance and applications, windows.

1. Definition:
SPSS is a Windows based program that can be used to perform data entry and analysis and to create tables and graphs. SPSS is capable of handling large amounts of data and can perform all of the analyses covered in the text and much more. It was originally launched in 1968 by SPSS Inc., and was later acquired by IBM in 2009. 

2. Scope: 
SPSS is short for Statistical Package for the Social Sciences, and it's used by various kinds of researchers for complex statistical data analysis. The SPSS software package was created for the management and statistical analysis of social science data.
 SPSS is used by market researchers, health researchers, survey companies, government entities, education researchers, marketing organizations, data miners, and many more for processing and analyzing survey data.

3. Core functions of SPSS:

SPSS offers four programs that assist researchers with your complex data analysis needs.

Statistics Program

SPSS’s Statistics program provides a plethora of basic statistical functions, some of which include frequencies, cross-tabulation, and bivariate statistics.

Modeler Program

SPSS’s Modeler program enables researchers to build and validate predictive models using advanced statistical procedures.

Text Analytics for Surveys Program

SPSS’s Text Analytics for Surveys program helps survey administrators uncover powerful insights from responses to open-ended survey questions.

Visualization Designer

SPSS’s Visualization Designer program allows researchers to use their data to create a wide variety of visuals like density charts and radial boxplots from their survey data with ease.

In addition to the four programs mentioned above, SPSS also provides solutions for data management, which allow researchers to perform case selection, create derived data, and perform file reshaping. 

SPSS also offers data documentation, which allows researchers to store a metadata dictionary. This metadata dictionary acts as a centralized repository of information pertaining to the data, such as meaning, relationships to other data, origin, usage, and format.

There are a handful of statistical methods that can be leveraged in SPSS, including: 

  • Descriptive statistics, including methodologies such as frequencies, cross-tabulation, and descriptive ratio statistics.
  • Bivariate statistics, including methodologies such as analysis of variance (ANOVA), means, correlation, and nonparametric tests.
  • Numeral outcome prediction such as linear regression.
  • Prediction for identifying groups, including methodologies such as cluster analysis and factor analysis..
Ref: https://www.alchemer.com/resources/blog/spss-survey-variables/


SPSS WINDOWS 

SPSS utilizes multiple types of windows, or screens, in its basic operations. Each window is associated with specific tasks and types of SPSS files. The windows include the Data Editor, Output Viewer, Syntax Editor, Pivot Table Editor, Chart Editor, and Text Output Editor.

DATA EDITOR

The Data Editor window is the default window and opens when SPSS starts. This window displays the content of any open data files and provides drop-down menus that allow you to modify and analyze data. The data are displayed in a spreadsheet format where columns represent variables and rows represent cases. The spreadsheet format includes two tabs at the bottom labeled Data View and Variable View. The Data View tab displays the open data set: variables appear in columns, and cases appear in rows. The Variable View tab displays information about variables in the open data (but not the data themselves), such as variable names, types, and labels, etc. 

OUTPUT VIEWER

When you perform any command in SPSS, the Output Viewer window opens automatically and displays a log of the actions taken and the associated output. Primarily, the Output Viewer is where the results of statistical analysis are shown, but any command invoked through the drop-down menus or syntax will be printed to the Output Viewer. This includes opening, closing, or saving a data file. If an Output Viewer window is not open when a command is run, a new Output Viewer window will automatically be created.


The Output Viewer window is divided into two sections, or frames. The left frame contains an outline of the content in the Output Viewer. This outline is especially useful when you have run many SPSS commands and need to locate a particular section of output easily. The right frame contains the actual output. Clicking on an item in the left frame will jump to that content in the right frame. Items that have been selected in the right frame are indicated by a red arrow and a box drawn around the content.

You can modify the contents in the Output Viewer by selecting items in the left or right frame and copying, pasting, or deleting them. To remove an item from the Output Viewer, click on its name in the left frame or click on the object itself in the right frame, then press the Delete key on your keyboard.

An Output Viewer window can be saved as a viewer file (*.spv) so that you can review it again without having to re-run the same commands in SPSS. To save an Output Viewer window, click File > Save As. Alternatively, you can export some or all of the contents in the Viewer window to a new document or image file by clicking File > Export. In general, you can export all content as a PDF (*.pdf), a PowerPoint file (*.ppt), an Excel file (*.xls or *.xlsx), a Word file (*.doc or *.docx), an HTML file (*.htm), or a text file (*.txt). Graphs can be saved as *.bmp, *.emf, *.eps, *.jpeg, *.png, or *.tif.

SYNTAX EDITOR

SPSS syntax is a programming language unique to SPSS that can be used as an alternative to the drop-down menus for data manipulation and statistical analyses. The Syntax Editor window is where users can write, debug,  and execute SPSS syntax. To open a new Syntax Editor window, click File > New > Syntax.

The right panel of the Syntax Editor window is where your syntax is entered. The left panel of the Syntax Editor window shows an outline of the commands in your syntax, and can be used to navigate within your code. You can jump to a specific part of your code by clicking on the command in the left panel. This feature is useful for showing the start and end points of a command, especially if the command is longer than one line.

Syntax can be saved as an *.sps file by clicking File > Save or File > Save As within the Syntax Editor window.

Syntax Commands: Advanced users can interact with SPSS by writing their own syntax. Syntax is a command-driven language that tells SPSS what actions to perform on the data. Using syntax commands (rather than drop-down menus) is preferable for several reasons.

Overall, syntax offers more flexibility, a clearer record, and greater ease in making changes and re-running commands. It does take some practice to learn to write the basic command language, but once you learn the language the benefits of working with data in this way will become very clear.

To use syntax, click File > New > Syntax. This opens a new Syntax Editor window where you can write and execute syntax commands.

ref: https://libguides.library.kent.edu/spss/environment

Day 2: Platforms of SPSS- Variable, data editor, syntax editor, output editor.

====================================
Day 3

Creating and Manipulating Variables in SPSS

One of the best ways analysts can add value is by finding new ways to examine data. There are many ways to change data that allow for further, more specific, analysis.

Here are a few of the common ones that I do regularly using the techniques we’ve been discussing:

  1. Create a variable that summarizes others (average, sum, etc.)
  2. Create an interval variable from a well-grouped categorical variable
  3. Create a binary (two-valued) variable from a categorical variable
  4. Regroup categorical variables to increase the size of sub-groups
  5. Regroup several variables into one
  6. Mathematically transform a variable (log, square, square root, etc.)
  7. Set specific values to missing so they are not used in analysis
  8. Replace missing values with a valid value
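
A few of these manipulations can be sketched outside SPSS to show the logic. The example below is only an illustration in Python, not SPSS syntax: the records, the variable names (q1, q2, q3, age_group, income), the missing-value code 99, and the age midpoints are all made-up assumptions.

```python
import math
from statistics import mean

# Hypothetical survey records; None marks an answer that was never given.
records = [
    {"q1": 4, "q2": 5, "q3": 3, "age_group": "18-24", "income": 1000},
    {"q1": 2, "q2": None, "q3": 1, "age_group": "25-34", "income": 100},
    {"q1": 5, "q2": 4, "q3": 99, "age_group": "35-44", "income": 10},
]

MISSING_CODE = 99  # assumed code that means "no answer"
AGE_MIDPOINT = {"18-24": 21, "25-34": 29.5, "35-44": 39.5}  # assumed midpoints


def transform(rec):
    """Return a copy of one record with a few derived variables added."""
    out = dict(rec)
    # Item 7: treat the special code as missing so it is not used in analysis.
    items = [None if rec[k] == MISSING_CODE else rec[k] for k in ("q1", "q2", "q3")]
    valid = [v for v in items if v is not None]
    # Item 1: a variable that summarizes others (mean of the valid answers).
    out["q_mean"] = mean(valid)
    # Item 2: an interval variable from a grouped categorical variable.
    out["age_mid"] = AGE_MIDPOINT[rec["age_group"]]
    # Item 3: a binary variable from a categorical variable.
    out["under_25"] = 1 if rec["age_group"] == "18-24" else 0
    # Item 6: a mathematical transformation (base-10 logarithm).
    out["log_income"] = math.log10(rec["income"])
    return out


transformed = [transform(r) for r in records]
for row in transformed:
    print(row["q_mean"], row["age_mid"], row["under_25"], row["log_income"])
```

Note that the third record's answer of 99 is excluded before the mean is taken; leaving it in would badly inflate the average.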
====================================
Day 3
Scales of Measurement 

Levels of Measurement in Statistics

To perform statistical analysis of data, it is important to first understand variables and what should be measured using these variables. There are different levels of measurement in statistics and data measured using them can be broadly classified into qualitative and quantitative data.

First, let’s understand what a variable is. A quantity whose value changes across the population and can be measured is called a variable. For instance, consider a sample of employed individuals. The variables for this sample can be industry, location, gender, age, skills, job type, etc. The values of the variables will differ from one employee to another.

What are Nominal, Ordinal, Interval and Ratio Scales?

Nominal, ordinal, interval, and ratio are the four fundamental levels of measurement used to capture data in surveys and questionnaires, typically through multiple-choice questions.

Each scale is an incremental level of measurement, meaning that each scale fulfills the function of the previous scale. All survey question scales, such as Likert, Semantic Differential, and Dichotomous, are derived from these four fundamental levels of variable measurement. Before we discuss all four levels of measurement scales in detail, with examples, let’s take a brief look at what these scales represent.

The nominal scale is a naming scale, where variables are simply “named” or labeled, with no specific order. The ordinal scale has all its variables in a specific order, beyond just naming them. The interval scale offers labels, order, and a specific interval between each of its variable options. The ratio scale bears all the characteristics of an interval scale and, in addition, has a true zero point, so a value of “zero” means the measured quantity is absent.

Nominal Scale: 1st Level of Measurement

Nominal Scale, also called the categorical variable scale, is defined as a scale used for labeling variables into distinct classifications and doesn’t involve a quantitative value or order. This scale is the simplest of the four variable measurement scales. Calculations done on these variables will be futile as there is no numerical value of the options.

There are cases where this scale is used for the purpose of classification – the numbers associated with variables of this scale are only tags for categorization or division. Calculations done on these numbers will be futile as they have no quantitative significance.

For a question such as:

Where do you live?

  • 1- Suburbs
  • 2- City
  • 3- Town

Nominal scale is often used in research surveys and questionnaires where only variable labels hold significance.

For instance, a customer survey might ask “Which brand of smartphone do you prefer?” with the options “Apple” (1), “Samsung” (2), and “OnePlus” (3).

  • In this survey question, only the names of the brands are significant for the researcher conducting consumer research. There is no need for any specific order for these brands. However, while capturing nominal data, researchers conduct analysis based on the associated labels.
  • In the above example, when a survey respondent selects Apple as their preferred brand, the data entered and associated will be “1”. This helps in quantifying and answering the final question: how many respondents selected Apple, how many selected Samsung, how many went for OnePlus, and which count is the highest.
  • This is the foundation of quantitative research, and the nominal scale is the most fundamental research scale.

Nominal Scale Data and Analysis

There are two primary ways in which nominal scale data can be collected:

  1. By asking an open-ended question, the answers to which can be coded to a number or label decided by the researcher.
  2. By including a multiple-choice question in which the answers are labeled.

In both cases, the gathered data are analyzed using percentages or the mode, i.e., the most common answer received for the question. A single question can have more than one mode, because two equally common favorites can exist in a target population.
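
The percentage and mode calculation can be sketched as follows. This is only an illustration in Python, not SPSS output: the response list is made up, and the brand codes follow the smartphone example above. The sample is deliberately built so that two brands tie, to show a question with more than one mode.

```python
from collections import Counter
from statistics import multimode

# Codes follow the smartphone example: 1 = Apple, 2 = Samsung, 3 = OnePlus.
BRAND_LABELS = {1: "Apple", 2: "Samsung", 3: "OnePlus"}

# Hypothetical coded answers from eight respondents.
responses = [1, 2, 2, 3, 1, 2, 1, 3]

counts = Counter(responses)  # how many times each code was chosen
n = len(responses)

# Percentage of respondents choosing each brand.
percentages = {BRAND_LABELS[code]: 100 * k / n for code, k in counts.items()}

# multimode returns every most-common value, so ties are not hidden.
modes = [BRAND_LABELS[code] for code in multimode(responses)]

print(percentages)
print(modes)
```

Here Apple and Samsung are each chosen by three of the eight respondents, so both are reported as modes. Averaging the codes themselves would be meaningless, because the numbers are only labels.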

Nominal Scale SPSS

In SPSS, you can specify the level of measurement as scale (numeric data on an interval or ratio scale), ordinal, or nominal. Nominal and ordinal data can be either string (alphanumeric) or numeric.

When the data for a variable are imported into an SPSS data file, SPSS treats it as a scale variable by default, since the data essentially contain numeric values. It is important to change the level to nominal or ordinal, or keep it as scale, depending on what the variable represents.

Ordinal Scale: 2nd Level of Measurement

Ordinal Scale is defined as a variable measurement scale used to simply depict the order of variables and not the difference between each of the variables. These scales are generally used to depict non-mathematical ideas such as frequency, satisfaction, happiness, a degree of pain, etc. It is quite straightforward to remember the implementation of this scale as ‘Ordinal’ sounds similar to ‘Order’, which is exactly the purpose of this scale.

The ordinal scale keeps descriptive qualities along with an intrinsic order, but it lacks an origin, so the distance between values cannot be calculated. Descriptive qualities are the tagging properties it shares with the nominal scale; in addition, the ordinal scale gives each value a relative position. Because the scale has no origin, there is no fixed start or “true zero”.

Ordinal Data and Analysis  

Ordinal scale data can be presented in tabular or graphical formats so that a researcher can analyze the collected data conveniently. Methods such as the Mann-Whitney U test and the Kruskal–Wallis H test can also be used to analyze ordinal data. These methods are generally used to compare two or more ordinal groups.

With the Mann-Whitney U test, researchers can assess whether values in one group tend to be larger or smaller than values in another group. With the Kruskal–Wallis H test, researchers can analyze whether two or more ordinal groups have the same median.
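
To make the idea behind the Mann-Whitney U test concrete, the U statistic can be computed by comparing every value in one group with every value in the other. The sketch below is an illustration in Python with made-up satisfaction ratings; the function name and the data are assumptions, and it computes only the U statistic, not the p-value that a statistics package would also report.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) using the pairwise definition of the statistic.

    U_a counts the pairs in which the value from group_a is larger than
    the value from group_b; each tied pair contributes 0.5.
    """
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    # The two U values always add up to the total number of pairs.
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b


# Hypothetical ratings on a 1 (very dissatisfied) to 5 (very satisfied) scale.
store_a = [4, 5, 3, 5, 4]
store_b = [2, 3, 1, 3, 2]

u_a, u_b = mann_whitney_u(store_a, store_b)
print(u_a, u_b)
```

With 5 x 5 = 25 pairs in total, a U value close to 25 for one group (or close to 0 for the other) suggests that its ratings tend to be higher, while values near 12.5 for both groups would suggest no clear difference.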
