In this project, you will work with a loan dataset used to predict whether a customer will fail to repay a loan. You will apply at least three predictive models to the training dataset. Once you have analyzed the models, you will pick the best one and apply it to a test dataset. The predictions on the test dataset will be used in a competition among students to determine who has the best model.
Financial institutions take on a certain amount of risk when lending to customers because some customers will not be able to pay the loan back. When this happens, the loan is said to be in default. To lower this risk, customer information from both healthy and defaulted loans is collected so that future defaults can be predicted from new customers' loan applications.
a. Explain what the technique is;
b. Explain how it is applied;
c. Show the summary results of the preprocessing.
d. Do not include raw code, raw output, or raw screen captures in the report; attach them in the project folder when uploading.
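As one illustration of items a to c, the sketch below applies median imputation to a single numeric column and returns a small summary that could be reported instead of raw output. The column name `income`, the toy records, and the helper `impute_median` are hypothetical and are not taken from the actual loan dataset; this is only one possible preprocessing technique.

```python
from statistics import median


def impute_median(rows, column):
    """Return a copy of rows with None in `column` replaced by the column median,
    plus a summary dict describing what was changed."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = median(observed)
    missing_before = sum(1 for r in rows if r[column] is None)
    # Build new dicts so the original records are left untouched.
    cleaned = [
        {**r, column: fill if r[column] is None else r[column]} for r in rows
    ]
    summary = {
        "column": column,
        "missing_before": missing_before,
        "missing_after": sum(1 for r in cleaned if r[column] is None),
        "fill_value": fill,
    }
    return cleaned, summary


# Hypothetical toy records; the real dataset's columns may differ.
loans = [
    {"income": 40000, "default": 0},
    {"income": None, "default": 1},
    {"income": 60000, "default": 0},
    {"income": 50000, "default": 0},
]

cleaned, summary = impute_median(loans, "income")
print(summary)
```

The summary dictionary is the kind of result that fits in the report; the code itself belongs in the project folder.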
a. Explain how the extra information can help you build the model. Discuss specifically which features in these two data files you think are most important.
b. Explain your decision on whether to integrate this extra information into your model building.
c. If you decide to integrate the extra information, explain how you integrate it, i.e., how you created new features for classification training from these two data files.
d. If you decide not to integrate the extra information, explain your reasoning.
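If the extra files hold several records per customer, one common way to integrate them is to aggregate each file to one row per customer and join the result onto the training data. The sketch below assumes a hypothetical `customer_id` key and a hypothetical `amount` field; the actual structure of the two extra data files is not described here, so the names and toy values are illustrative only.

```python
from collections import defaultdict


def aggregate_by_customer(records, key="customer_id", value="amount"):
    """Collapse many records per customer into count, total, and mean features."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec[key]].append(rec[value])
    return {
        cid: {"n_records": len(v), "total": sum(v), "mean": sum(v) / len(v)}
        for cid, v in grouped.items()
    }


def add_features(training_rows, features, key="customer_id"):
    """Left-join aggregated features onto the training rows.
    Customers with no extra records get zero-valued features (an assumption)."""
    default = {"n_records": 0, "total": 0.0, "mean": 0.0}
    return [{**row, **features.get(row[key], default)} for row in training_rows]


# Hypothetical extra file: several payment records per customer.
payments = [
    {"customer_id": 1, "amount": 100.0},
    {"customer_id": 1, "amount": 300.0},
    {"customer_id": 2, "amount": 50.0},
]
# Hypothetical training rows, one per customer.
training = [
    {"customer_id": 1, "default": 0},
    {"customer_id": 2, "default": 1},
    {"customer_id": 3, "default": 0},
]

features = aggregate_by_customer(payments)
enriched = add_features(training, features)
print(enriched)
```

Filling unmatched customers with zeros is a design choice worth justifying in the report; a separate "has no records" indicator is another option.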
a. Explain what the technique is and how it works.
b. Explain what the parameters of the technique are and how they are chosen and tuned.
c. Explain and discuss the predictive results and performance of the technique. Analyze different aspects of the results, including but not limited to the ROC curve, F score, and accuracy.
d. Do not include raw code, raw output, or raw screen captures.
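To make the metrics in item c concrete, the sketch below computes accuracy, the F1 score (the F score with equal weight on precision and recall), and the area under the ROC curve from scratch. The labels, scores, and the 0.5 decision threshold are toy assumptions, not output from any model on the loan data. The AUC is computed as the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, y_score):
    """Area under the ROC curve via pairwise ranking of positives vs. negatives."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy labels (1 = default) and predicted default probabilities.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
# Assumed decision threshold of 0.5.
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, y_score)
print(acc, f1, auc)
```

Accuracy and F1 depend on the chosen threshold, while AUC does not, which is one reason the three can rank models differently on an imbalanced default dataset.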
a. The performance comparison among your techniques.
b. A visualization or table showing the performance differences. Make sure you explain and discuss the visualization or table.
c. Explain the probable reasons behind the performance differences.
d. Explain which technique is the best for the dataset.
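A comparison table for items a, b, and d can be assembled from per-model metrics, as in the sketch below. The model names and every number here are placeholders for illustration only, not results from any real run, and ranking by AUC is an assumed choice that should be justified for the dataset at hand.

```python
def comparison_table(results, sort_by="auc"):
    """Rank models by one metric and render a fixed-width text table.
    Returns (name of the top-ranked model, table as a string)."""
    metrics = ["accuracy", "f1", "auc"]
    ranked = sorted(results.items(), key=lambda kv: kv[1][sort_by], reverse=True)
    lines = [f"{'model':<22}" + "".join(f"{m:>10}" for m in metrics)]
    for name, scores in ranked:
        lines.append(
            f"{name:<22}" + "".join(f"{scores[m]:>10.3f}" for m in metrics)
        )
    return ranked[0][0], "\n".join(lines)


# Placeholder values for illustration only; replace with your own measured metrics.
results = {
    "logistic_regression": {"accuracy": 0.80, "f1": 0.55, "auc": 0.78},
    "decision_tree": {"accuracy": 0.76, "f1": 0.50, "auc": 0.70},
    "random_forest": {"accuracy": 0.83, "f1": 0.60, "auc": 0.85},
}

best, table = comparison_table(results)
print(table)
print("best by auc:", best)
```

Changing `sort_by` to `"f1"` or `"accuracy"` shows whether the choice of best model is stable across metrics, which is worth discussing alongside the table.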