Economics & Statistics (excel)

The purpose of the project is for you to gain experience in applying the methods taught in the class to a real data set of interest to you.

Objectives

At the end of this project, you will be able to:
1. Define a research question and define appropriate variables to measure on a topic that you care about
2. Decide on an experimental design
3. Gather and analyze original data (rather than data prepared for you) on an issue of interest to you
4. Prepare an appropriate report
5. Provide constructive, thoughtful feedback to peers on their projects

Project Suggestions

Conducting an empirical analysis of economic data can be rewarding and informative. The first step in conducting an empirical analysis is choosing the topic you want to study and, within that topic, the specific question or questions you will investigate. Although there is not a single best way to choose a topic, the following suggestions might be useful.

Pick a topic that you find personally interesting, ideally one about which you already have some knowledge. The topic might be related to your career interests, summer work you did, employment experience of a family member, or something of intellectual interest to you. Often, a specific policy problem, a personal decision, or a business issue raises questions that can be addressed by an empirical study.

Make the question that will be the main focus of your study as specific as possible.
The more narrowly the question relates to a measurable causal effect, the easier it will be to answer.

Check the related literature. You might find published studies on topics closely related to yours. Use previous work to give you ideas about data sources and about what questions have not yet been answered.

Choose a question that can be answered using the available data. Although the question you originally pose might not be answerable using available data, the data might support the analysis of a related and equally interesting question.

Share your topic on the discussion board. If you find your topic interesting, then the odds are that others will too, and an instructor or classmate might suggest an angle that you have not thought of.

As shown in the table below, the course project consists of 6 checkpoints that lead up to, and include, the final report.

Project Format

This project relies on multiple regression analysis to analyze a data set that is of interest to you. The final report for the project should be a 5-10 page single-spaced paper that describes the questions of interest, how you used your data set to analyze these questions (with details on the steps you used in your analysis), your findings about your question of interest, and the limitations of your study. Specifically, your report should contain the following:

1. Introduction.
The introduction succinctly states the problem you are interested in, briefly describes your data and the method of analysis, and summarizes your main conclusions: what you set out to learn and what you ended up finding. It should summarize the entire report.

2. Data Description. This section provides the details of the data sources and any transformations you have done to the data (for example, changing the units of some variables), gives a table of summary statistics (means and standard deviations) of the variables, and provides scatterplots and/or other relevant plots of the data. If there are outliers other than those arising from corrected typographical or computer errors, this is the place to point them out.

3. Regression Analysis. Describe how you used multiple regression to analyze the data set. Specifically, you should discuss how you carried out the steps in analysis discussed in class, i.e., exploring the data to find an initial reasonable model, checking the model, and changing the model based on those checks.

4. Empirical Results. This section provides the main empirical results in the paper. Conventionally, regression results are presented in tabular form, with footnotes clearly explaining the entries. The initial table of results should present the main results; sensitivity analysis using alternative specifications can be presented in additional columns in that table or in subsequent tables.
For organizational purposes and clarity, you may choose to have some tables at the end of the paper, with appropriate references in the body of the paper as needed. The text should provide a careful discussion of the results, including assessments both of statistical significance and of economic significance, that is, the magnitude of the estimated relations in a real-world sense.

5. Summary and Discussion. This section summarizes your main empirical findings and discusses their implications for the original question of interest. Describe any limitations of your study and how they might be overcome in future research, and provide brief conclusions about the results of your study.

These are the project activities we did for this project.

Project Checkpoint 1: Topic Selection
Project Checkpoint 2: Hypothesis & Research Question
Project Checkpoint 3: Identify Variables for Study
Project Checkpoint 4: Data Sets
Project Checkpoint 5: Regression Analysis
Project Checkpoint 6: Final Report (Sunday, July 28)

Project Checkpoint 1
Topic Selection

The model and the data are the starting points of an econometric project. The first step in formulating a model is to select a topic of interest and to consider the model's scope and purpose. In particular, thought should be given to the objectives of the study, what boundaries to place on the topic, what hypotheses might be tested, what variables might be predicted, and what policies might be evaluated. Close attention must be paid, however, to the availability of adequate data.
In particular, the model must involve causal relations among measurable variables.

The topic selected can be economic or non-economic. It could be a particular market (the market for college graduates, the market for economists, the market for cellular phones), a process (economic development, inflation, unemployment), demographic phenomena (birth rates, death rates), environmental phenomena (water quality, air quality), political phenomena (elections, voting behavior of legislatures), some combination of these, or some other topic. You are free to choose your own topic, but it will require approval from your instructor.

Some paper title examples are presented below:

Air Pollution and Population Differential Growth in U.S. Cities
Birth Rates, Death Rates, and Economic Growth in Developing Economies
Economic and Social Determinants of Infant Mortality in the United States
The Relationship between Exports and Growth in Less Developed Countries

Remember that the ideas above are merely examples of reasonable topics. You should be original and follow your own interests. Perhaps the best choice of a topic is one in which you have prior experience or knowledge.

Keep in mind that this project is studying the impact of some independent variable X on a dependent variable Y. But since there are many variables X that influence the variable Y, it is important to include all those variables on the right hand side of the equation.
To ensure that the model is both interesting and manageable, it should contain at least three to four independent variables on the right hand side. The model should be formulated as an algebraic, linear, stochastic equation along with a corresponding verbal statement of the meaning of the equation. The expected signs of all the coefficients should be considered. All relevant multipliers, short-run and long-run, should be identified and considered.

An example would be:

Being a college student, I know that college can be very expensive. However, depending on where you go to school, this price can vary, sometimes by more than $30,000. For my project I would like to study this variation in tuition price and how it is affected by factors such as school ranking, location, private/public standing, etc.

Project Checkpoint 2
Hypothesis and Research Question

Particular thought should be given to the objectives of the study, what boundaries to place on the topic, what hypotheses might be tested, what variables might be predicted, and what policies might be evaluated. Once you have a general understanding of your topic, narrow it down into a manageable research question or hypothesis. This will help you define the parameters of your research, as well as your argument. A research hypothesis is a statement of expectation or prediction that will be tested by research. Hypotheses look very much like “mini-arguments”; the objective of the research paper is to present evidence that will prove those hypotheses. Before formulating your research hypothesis, read about the topic of interest to you.
From your reading, which may include articles, books and/or cases, you should gain sufficient information about your topic to narrow or limit it and express it as a research question. The research question flows from the topic that you are considering. The research question, when stated as one sentence, is your Research Hypothesis. In your hypothesis, you are predicting the relationship between variables. Through the disciplinary insights gained in the research process throughout the year, you “prove” your hypothesis. This is a process of discovery to create greater understandings or conclusions.

An example of this checkpoint would be:

The objective of my study is to figure out whether the enormous population in China is the main factor that leads to low per capita GDP in China and whether other factors account for this phenomenon. The boundary is that I would only focus on the correlation between GDP and population in China, not other countries. My hypothesis is that the enormous population in China is not the strongest factor that leads to low per capita GDP in China.

Project Checkpoint 3
Identify Variables for the Study

Keep in mind that this project is studying the impact of some independent variable X on a dependent variable Y.
But since there are many variables X that influence the variable Y, it is important to include all those variables on the right hand side of the equation. To ensure that the model is both interesting and manageable, it should contain at least three to four independent variables on the right hand side. The model should be formulated as an algebraic, linear, stochastic equation along with a corresponding verbal statement of the meaning of the equation. The expected signs of all the coefficients should be considered. All relevant multipliers, short-run and long-run, should be identified and considered.

An example of this section would be:

The soccer players' salaries and their performances/statistics. The dependent variable will be salary, and my independent variables would be games played, goals, assists, and the number of yellow or red cards during the 2015-16 season. The most important would be the number of goals or assists they brought to the game.

Salary (y) = b0 + b1x1 + b2x2 + b3x3 + b4x4

From the equation, salary is expected to be positively correlated with the number of goals, games played, and assists, and negatively correlated with the number of cards in the season.

Project Checkpoint 4
Data Sets

Before finding a data set, you must be aware of what data will help you to answer the question you are investigating. It helps to understand how you intend to perform your analysis. What unit of observation would be most useful (local governmental data? international data? etc.)?
In order for you to choose the right data set, you must be clear about what variables you are using before you search for your data set. You should already know what you are using for your dependent variable and what variables will help you answer the research question most effectively. The library sources would be a good place to start in your search for data. In addition to the material resources available there, you can also seek assistance from the data librarian, who will point you in the right direction.

Here are some ideas for data sources that are available for public use:

Statistical Abstract of the US
Statistical Handbooks
Statistical Yearbooks
Federal Reserve Economic Data
International Economic Conditions
National Statistical Abstract
Center for Research in Securities Prices

Project Checkpoint 5
Regression Analysis

The first step in your empirical analysis is getting familiar with the data. Plot the data, using histograms and/or scatterplots. Are there big outliers, and if so, are those observations accurately recorded or are they typographical or data manipulation errors? Be very careful when you input your data, as any errors may completely throw off your analysis. Once you feel that the data are error-free, you can start looking at specific relationships. Are the units of the data the ones you expected, and are they the ones you want to use? Do the relations you see in the scatterplots make sense? Do relationships look linear, or do they look nonlinear? Once you are acquainted with your data set, you can begin your regression analysis.
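
The first-pass checks described above (summary statistics and a look for big outliers) could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the tuition figures are invented, `summarize` and `flag_outliers` are hypothetical helper names, and the 1.5 x IQR fence is one common rule of thumb rather than a method prescribed by the course.

```python
import statistics

# Invented sample: annual tuition in $1000s for eight schools.
# The last entry is deliberately suspicious (a possible typo for 45.0).
tuition = [12.0, 15.5, 31.0, 45.2, 9.8, 28.4, 52.0, 450.0]


def summarize(values):
    """Return the summary statistics a data-description table would report."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def flag_outliers(values, k=1.5):
    """Return the indices of values outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


print(summarize(tuition))
for i in flag_outliers(tuition):
    # A flagged value is a prompt to re-check the source, not proof of an error.
    print(f"check observation {i}: {tuition[i]}")
```

A flagged observation should be traced back to the original source before it is corrected or dropped; if it turns out to be accurately recorded, it belongs in the outlier discussion of the data description.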
This is the point at which all the previous work you have done preparing your study begins to pay off.

Project Part 6: Final Report

Project Format

This project relies on multiple regression analysis to analyze a data set that is of interest to you. The final report for the project should be a 5-10 page paper that describes the questions of interest, how you used your data set to analyze these questions (with details on the steps you used in your analysis), your findings about your question of interest, and the limitations of your study. Specifically, your report should contain the following:

Introduction. The introduction succinctly states the problem you are interested in, briefly describes your data and the method of analysis, and summarizes your main conclusions: what you set out to learn and what you ended up finding. It should summarize the entire report.

Data Description. This section provides the details of the data sources and any transformations you have done to the data (for example, changing the units of some variables), gives a table of summary statistics (means and standard deviations) of the variables, and provides scatterplots and/or other relevant plots of the data. If there are outliers other than those arising from corrected typographical or computer errors, this is the place to point them out.

Regression Analysis. Describe how you used multiple regression to analyze the data set.
Specifically, you should discuss how you carried out the steps in analysis discussed in class, i.e., exploring the data to find an initial reasonable model, checking the model, and changing the model based on those checks.

Empirical Results. This section provides the main empirical results in the paper. Conventionally, regression results are presented in tabular form, with footnotes clearly explaining the entries. The initial table of results should present the main results; sensitivity analysis using alternative specifications can be presented in additional columns in that table or in subsequent tables. For organizational purposes and clarity, you may choose to have some tables at the end of the paper, with appropriate references in the body of the paper as needed. The text should provide a careful discussion of the results, including assessments both of statistical significance and of economic significance, that is, the magnitude of the estimated relations in a real-world sense.

Summary and Discussion. This section summarizes your main empirical findings and discusses their implications for the original question of interest. Describe any limitations of your study and how they might be overcome in future research, and provide brief conclusions about the results of your study.
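
To make the regression and results steps concrete, here is a minimal sketch of ordinary least squares for a model of the form y = b0 + b1x1 + ... + bkxk, written in plain Python so that each step is visible. Everything specific in it is an assumption for illustration: the player data are invented, `invert` and `ols` are hypothetical helper names, and solving the normal equations by Gauss-Jordan elimination is a teaching device. For the actual project, a spreadsheet regression tool or a statistics package would normally do this work.

```python
import math


def invert(m):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: check for perfectly collinear regressors")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols(y, xs):
    """Regress y on the columns of xs plus a constant; return estimates and fit."""
    X = [[1.0] + [float(v) for v in row] for row in xs]  # prepend the intercept column
    n, k = len(X), len(X[0])
    Xt = [list(col) for col in zip(*X)]
    XtX = [[sum(a * b for a, b in zip(r, c)) for c in zip(*X)] for r in Xt]
    XtX_inv = invert(XtX)
    Xty = [sum(a * b for a, b in zip(col, y)) for col in Xt]
    beta = [sum(a * b for a, b in zip(row, Xty)) for row in XtX_inv]
    resid = [yi - sum(b * x for b, x in zip(beta, row)) for yi, row in zip(y, X)]
    ssr = sum(e * e for e in resid)
    ybar = sum(y) / n
    sst = sum((yi - ybar) ** 2 for yi in y)
    sigma2 = ssr / (n - k)  # residual variance with n - k degrees of freedom
    se = [math.sqrt(sigma2 * XtX_inv[j][j]) for j in range(k)]
    t = [b / s if s > 0 else float("inf") for b, s in zip(beta, se)]
    return {"beta": beta, "se": se, "t": t, "r2": 1 - ssr / sst}


# Invented data: (games played, goals, assists, cards) and salary in $1000s.
players = [(30, 12, 5, 3), (28, 8, 7, 5), (34, 20, 9, 2), (15, 2, 1, 6), (22, 5, 4, 4),
           (31, 15, 3, 1), (27, 9, 10, 7), (18, 3, 2, 2), (33, 18, 6, 4), (25, 7, 8, 3)]
salary = [950, 720, 1600, 210, 430, 1100, 800, 300, 1400, 650]

fit = ols(salary, players)
names = ["constant", "games", "goals", "assists", "cards"]
for name, b, s, tval in zip(names, fit["beta"], fit["se"], fit["t"]):
    print(f"{name:>9}  coef={b:10.3f}  se={s:9.3f}  t={tval:8.3f}")
print(f"R-squared = {fit['r2']:.3f}")
```

The printed rows correspond to the entries of a conventional results table: each coefficient with its standard error and t-statistic, plus R-squared for the model. The sign of each coefficient can then be compared with the expected sign stated when the model was formulated, and its magnitude discussed in real-world units.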
This is the rubric for this project:

Final Project Rubric

INTRODUCTION
Poor (60%): The introduction does not state the problem of interest or describe the data and method of analysis. It does not summarize main conclusions, and does not provide a summary of what we set out to learn. There is essentially no introduction.
Fair (70%): There is a short introduction that may or may not state the problem of interest. It mentions the data, but does not describe it. No major conclusions are drawn.
Good (80%): The introduction is a bit lengthy or short but does state the problem of interest and describes the data and methods of analysis, although not clearly. Main conclusions are presented but are not summarized.
Excellent (90%): The introduction succinctly states the problem of interest, briefly describes the data and method of analysis, and summarizes main conclusions: what you set out to learn and what you ended up finding.
Perfect (100%)

DATA DESCRIPTION
Poor (60%): This section does not detail data sources or transformations, nor does it give any summary statistics. It does not provide any relevant graphical information.
Fair (70%): This section provides very little information in regards to data sources, transformations, and summary statistics. There are few or no scatterplots or other relevant graphical representations.
Good (80%): This section covers data sources, transformations, summary statistics, scatterplots, and outliers if there are any. However, the order, structure, and presentation of the data do not flow well.
Excellent (90%): Provides details of data sources and transformations of the data, gives a table of summary statistics, and provides scatterplots and other relevant plots of the data. If there are outliers, they are made obvious.
Perfect (100%)

REGRESSION ANALYSIS
Absent (0%): The criterion is completely absent.
Poor (60%): The use of multiple regression is not clear, nor is the regression model explained. There are no references made to class material.
Fair (70%): A multiple regression model is evident but is not clearly explained. Few, if any, references to the course material are made.
Good (80%): The use of multiple regression is made clear, but there is not strong evidence supporting the use of a particular model. There are references made to supporting course material to justify decisions.
Excellent (90%): How multiple regression was used is described. The steps of analysis that were covered in readings/lectures are discussed. The process for deciding on a particular regression model is well discussed and justified.
Perfect (100%)

EMPIRICAL RESULTS
Absent (0%): The criterion is completely absent.
Poor (60%): This section has no appropriate empirical results presented. There may be regression results that do not refer to the regression model that was outlined in the regression analysis section. There is no discussion surrounding the results.
Fair (70%): Multiple regression results are presented. However, there is little to no discussion of the results, and if subsequent tables are needed, they are not presented.
Good (80%): All results of analysis are presented. However, the discussion of the results is unclear or inaccurate.
Excellent (90%): Regression results are presented in tabular form with footnotes explaining entries. The table presents main results, while sensitivity analysis or alternative specifications are outlined in subsequent tables. The text provides a careful discussion of the results, including assessments of both statistical and economic significance.
Perfect (100%)

SUMMARY AND DISCUSSION
Absent (0%): The criterion is completely absent.
Poor (60%): There is no summary or discussion of the empirical results. No limitations for the study are presented, and there is no conclusion.
Fair (70%): There is little understanding or description of the empirical results and their implications for the original question of interest. Few or no limitations are presented, and the conclusion is unclear.
Good (80%): Empirical findings are summarized, but the implications for the original question are limited. Limitations of the study are very briefly addressed, and the conclusion of the study could be clearer.
Excellent (90%): This section summarizes the main empirical findings and discusses implications. Limitations of the study are addressed so that they may be overcome in the future. A very brief conclusion of the results is presented.
Perfect (100%)