The Online Payment Fraud Detection project uses machine learning to identify fraudulent transactions in online payment systems. By analyzing various transaction features, the model classifies each transaction as either fraudulent or legitimate. 🕵️‍♂️🔍💡
- Data Preparation: The dataset is read and preprocessed to handle missing values, transform categorical features, and address outliers. 📊🧹
- Feature Engineering: Categorical features are converted into numerical values, and feature skewness is adjusted by removing outliers. 🔧📉
- Exploratory Data Analysis (EDA): Various visualizations and statistical analyses are performed to understand the distribution of features and relationships between variables. 📈📉🔍
- Model Training: A Decision Tree Classifier is trained on the preprocessed dataset. The model's performance is evaluated on test data. 🏋️‍♂️📚
- Prediction: The trained model predicts the fraud status of new transactions based on given features. 🤖📊
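
The preparation and feature engineering steps above could look roughly like the sketch below. The README does not name the exact encoding or outlier rule, so the integer category codes and the 1.5 × IQR filter are assumptions, and `preprocess` is a hypothetical helper rather than a function from the project.

```python
import pandas as pd


# Hypothetical helper: the encoding and outlier rule are assumptions,
# not necessarily what the project itself does.
def preprocess(df: pd.DataFrame, outlier_cols=("amount",)) -> pd.DataFrame:
    """Drop missing rows, encode `type` as integers, and trim IQR outliers."""
    out = df.dropna().copy()

    # Encode the categorical `type` column as integer codes.
    out["type"] = out["type"].astype("category").cat.codes

    # Keep rows within 1.5 * IQR of the quartiles for each chosen column.
    for col in outlier_cols:
        q1, q3 = out[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        out = out[out[col].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)]

    return out.reset_index(drop=True)


# Tiny made-up sample, only to show the call shape.
sample = pd.DataFrame({
    "type": ["PAYMENT", "TRANSFER", "PAYMENT", "TRANSFER", None],
    "amount": [100.0, 120.0, 90.0, 1_000_000.0, 50.0],
})
clean = preprocess(sample)
print(clean)
```

On this sample, the row with the missing `type` and the row with the extreme `amount` are both dropped. Whether trimming large amounts is appropriate for fraud data depends on the dataset, since unusual amounts can themselves be a signal.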
The dataset used, named `MachineLearningProject.csv`, includes:
- `Unnamed: 0`: Index column
- `step`: Time step (1 hour) ⏳
- `type`: Type of online transaction (e.g., PAYMENT, TRANSFER) 💸
- `amount`: Transaction amount 💵
- `nameOrig`: Originating customer 🏦
- `oldbalanceOrg`: Balance before transaction 💰
- `newbalanceOrig`: Balance after transaction 📈
- `nameDest`: Recipient customer 🏤
- `oldbalanceDest`: Initial balance of recipient 🏦
- `newbalanceDest`: New balance of recipient 📉
- `isFraud`: Indicates if the transaction is fraudulent (1) or not (0) 🚨
- `isFlaggedFraud`: Indicates if the transaction was flagged as fraud (1) or not (0) 🚩
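
As a rough illustration of this layout, the sketch below reads a made-up two-row excerpt with the same columns. The values are invented, and dropping the ID-like columns `nameOrig` and `nameDest` from the features is an assumption rather than something the project states. Passing `index_col=0` to `pandas.read_csv` turns the unnamed first column into the index, so no `Unnamed: 0` column appears.

```python
import io

import pandas as pd

# Made-up two-row excerpt with the column layout listed above; values are illustrative only.
csv_text = """,step,type,amount,nameOrig,oldbalanceOrg,newbalanceOrig,nameDest,oldbalanceDest,newbalanceDest,isFraud,isFlaggedFraud
0,1,PAYMENT,100.0,C1,500.0,400.0,M1,0.0,0.0,0,0
1,1,TRANSFER,500.0,C2,500.0,0.0,C3,0.0,0.0,1,0
"""

# For the real file, pass "MachineLearningProject.csv" instead of the StringIO buffer.
df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Split features from the label; dropping the ID-like columns is an assumption.
X = df.drop(columns=["isFraud", "nameOrig", "nameDest"])
y = df["isFraud"]
print(df.dtypes)
```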
Install the required packages using the provided `requirements.txt` file. 🔧
- Load the dataset and preprocess it. 🔍
- Train the model using the prepared data. 🎯
- Evaluate the model to check its performance. ✅📊
- Make predictions on new transaction data. 💡📈
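
The steps above can be sketched end to end as follows. This is a minimal sketch on synthetic stand-in data, not the project's dataset: the toy fraud rule, the chosen feature subset, the 0/1 encoding of `type`, and the 80/20 split are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (NOT the project's dataset); fraud follows a toy rule.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "type": rng.integers(0, 2, n),  # assumed encoding: 0 = PAYMENT, 1 = TRANSFER
    "amount": rng.uniform(1, 10_000, n),
    "oldbalanceOrg": rng.uniform(0, 20_000, n),
})
df["newbalanceOrig"] = (df["oldbalanceOrg"] - df["amount"]).clip(lower=0)
df["isFraud"] = ((df["type"] == 1) & (df["newbalanceOrig"] == 0)).astype(int)

# Hold out 20% of the rows for evaluation, keeping the class ratio in both splits.
features = ["type", "amount", "oldbalanceOrg", "newbalanceOrig"]
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["isFraud"], test_size=0.2, random_state=42, stratify=df["isFraud"]
)

# Train the decision tree and measure accuracy on the held-out rows.
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))

# Predict the status of one new, made-up transaction.
new_tx = pd.DataFrame([[1, 9000.0, 9000.0, 0.0]], columns=features)
label = "Fraud" if model.predict(new_tx)[0] == 1 else "No Fraud"
print(f"accuracy={accuracy:.3f}, prediction={label}")
```

Fixing `random_state` makes the split and the tree reproducible between runs, which helps when comparing preprocessing choices.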
- Model Accuracy: 100% 🏆
- Predictions:
  - For certain features, the model predicts `'Fraud'`. 🚨
  - For other features, it predicts `'No Fraud'`. ✔️
- Libraries used: `pandas`, `numpy`, `plotly`, `seaborn`, `matplotlib`, `scikit-learn` 🌟
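
When fraudulent transactions are rare, accuracy alone can hide missed fraud cases, so it may be worth checking a confusion matrix and per-class recall alongside the accuracy figure. The sketch below uses made-up labels purely for illustration; they are not results from this project.

```python
from sklearn.metrics import classification_report, confusion_matrix

# Made-up labels for illustration only: 1 = fraud, 0 = legitimate.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
tn, fp, fn, tp = cm.ravel()

# Recall on the fraud class: the share of actual fraud cases that were caught.
recall = tp / (tp + fn)

print(cm)
print(classification_report(y_true, y_pred, target_names=["No Fraud", "Fraud"]))
```

In this toy case 9 of 10 predictions are correct, yet only one of the two fraud cases is caught, which is the kind of gap a single accuracy number does not show.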