PythonPlaza.com

Support Vector Machine (SVM)

A Support Vector Machine (SVM) is a supervised machine learning algorithm used for both classification and regression. For classification, it works by finding the decision boundary (a line in two dimensions, a hyperplane in general) that separates the groups in the data with the widest possible margin. SVM is well suited to sorting things into two groups, such as deciding whether an email is spam or not, or whether an image shows a cat or a dog.
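To make the idea of a decision boundary concrete, here is a minimal sketch using scikit-learn's SVC with a linear kernel. The six data points are hand-made for illustration and are not taken from any of the datasets on this page. The sign of decision_function tells you which side of the boundary a point falls on.

```python
import numpy as np
from sklearn.svm import SVC

# Hand-made toy data (hypothetical): two well-separated groups in 2-D
X = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.5],
              [6.0, 6.0], [6.5, 7.0], [7.0, 6.5]])
y = np.array([0, 0, 0, 1, 1, 1])

# A linear kernel gives a straight-line decision boundary
clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

# Classify two new points, one near each group
new_points = np.array([[1.2, 1.8], [6.8, 6.2]])
labels = clf.predict(new_points)

# Negative scores fall on the class 0 side, positive scores on the class 1 side
scores = clf.decision_function(new_points)

print("Predicted labels:", labels)
print("Signed scores:", scores)
```

The larger the absolute value of the score, the further the point lies from the boundary.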

Support vectors are the data points closest to the decision boundary; they alone determine where the boundary lies. The margin is the distance between the boundary and the nearest points from each group.
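The sketch below shows one way to inspect the support vectors and the margin of a fitted linear SVM. The toy points are made up for illustration, and the very large C value is an assumption chosen so that the model behaves approximately like a hard-margin SVM. For a linear kernel the distance from the boundary to the nearest point is 1 / ||w||, where w is the weight vector stored in coef_.

```python
import numpy as np
from sklearn.svm import SVC

# Hand-made toy data (hypothetical): two linearly separable groups
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [3.0, 3.0], [3.0, 4.0], [4.0, 3.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A very large C leaves almost no room for margin violations
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

# The points that define the boundary
support_vectors = clf.support_vectors_

# Distance from the boundary to the nearest point of each group
w = clf.coef_[0]
margin = 1.0 / np.linalg.norm(w)

print("Support vectors:\n", support_vectors)
print("Margin (boundary to nearest point):", margin)
```

Only a subset of the training points appears in support_vectors_; moving any of the other points slightly would not change the boundary.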

USE CASE 1: Use a Support Vector Machine (SVM) with scikit-learn to predict whether a loan will default.

Dependent variable: Default (0 = No Default, 1 = Default)

Independent variables (3):
Income (monthly income, e.g., 1000–10000)
CreditScore (300–850)
LoanAmount (1000–50000)

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# -----------------------------------
# 1. Load data from Excel
# -----------------------------------
df = pd.read_excel("loan_data.xlsx")  # read_excel already returns a DataFrame

print("Dataset Preview:")
print(df.head())

# -----------------------------------
# 2. Define features and target
# -----------------------------------
feature_cols = ["Income", "CreditScore", "LoanAmount"]
X = df[feature_cols]
y = df["Default"]

# -----------------------------------
# 3. Split into training and testing
# -----------------------------------
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Scale the features: fit the scaler on the training data only,
# then apply the same transformation to the test data
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# -----------------------------------
# 4. Train the Support Vector Machine (SVM) model
# -----------------------------------
svm_model = SVC(
    kernel='rbf',        # nonlinear boundary
    C=1.0,               # regularization parameter
    gamma='scale',
    probability=True,    # allows predict_proba
    random_state=42
)
svm_model.fit(X_train_scaled, y_train)

# -----------------------------------
# 5. Evaluate the model
# -----------------------------------
y_pred = svm_model.predict(X_test_scaled)

print("Accuracy:", accuracy_score(y_test, y_pred))
print("\nConfusion Matrix:\n", confusion_matrix(y_test, y_pred))
print("\nClassification Report:\n", classification_report(y_test, y_pred))

# -----------------------------------
# 6. Predict default for a new customer
# -----------------------------------
# Income, CreditScore, LoanAmount
new_customer = pd.DataFrame([[4500, 620, 16000]], columns=feature_cols)
new_customer_scaled = scaler.transform(new_customer)  # same scaling as training

default_prediction = svm_model.predict(new_customer_scaled)
default_probability = svm_model.predict_proba(new_customer_scaled)[0][1]

print("Default Prediction:", default_prediction[0])
print("Probability of Default:", default_probability)
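Because the model is trained on scaled features, every new input must pass through the same scaler before prediction. One way to make that automatic is to bundle the scaler and the SVM in a scikit-learn pipeline. The sketch below is an illustration only: the rows are randomly generated placeholders (not real loan data) and the labelling rule is a made-up assumption so that the code runs without loan_data.xlsx.

```python
import numpy as np
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic placeholder rows (NOT real loan data), only so the sketch runs
rng = np.random.default_rng(42)
n_rows = 200
feature_cols = ["Income", "CreditScore", "LoanAmount"]
df = pd.DataFrame({
    "Income": rng.uniform(1000, 10000, n_rows),
    "CreditScore": rng.uniform(300, 850, n_rows),
    "LoanAmount": rng.uniform(1000, 50000, n_rows),
})

# Hypothetical labelling rule, for illustration only
df["Default"] = (
    (df["CreditScore"] < 600) & (df["LoanAmount"] > 4 * df["Income"])
).astype(int)

# The pipeline scales the inputs and then applies the SVM in one object
pipeline = make_pipeline(
    StandardScaler(),
    SVC(kernel="rbf", C=1.0, gamma="scale", probability=True, random_state=42),
)
pipeline.fit(df[feature_cols], df["Default"])

# New inputs are scaled automatically inside the pipeline
new_customer = pd.DataFrame([[4500, 620, 16000]], columns=feature_cols)
prediction = pipeline.predict(new_customer)
probability = pipeline.predict_proba(new_customer)[0][1]

print("Default Prediction:", prediction[0])
print("Probability of Default:", probability)
```

With this design there is no separate scaler object to remember at prediction time, which removes a common source of silent errors.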

USE CASE 2: Customer churn prediction using a Support Vector Machine (SVM) with scikit-learn in Python.

We'll assume 4 independent variables:
Tenure (months with company): 1–60 months
MonthlyCharges (amount billed per month): 30–120
ContractType (0 = Month-to-month, 1 = One-year, 2 = Two-year)
SupportCalls (number of calls to support): 0–10

The dependent variable is Churn (0 = Stay, 1 = Churn).

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# -----------------------------------
# 1. Load data from Excel
# -----------------------------------
# Sample data can be exported to Excel from the URL
# https://www.pythonplaza.com/categorical_customer_churn_1_or_0.html
df = pd.read_excel("customer_data.xlsx")

print("Dataset Preview:")
print(df.head())

# -----------------------------------
# 2. Define features and target
# -----------------------------------
feature_cols = ["Tenure", "MonthlyCharges", "ContractType", "SupportCalls"]
X = df[feature_cols]
y = df["Churn"]

# -----------------------------------
# 3. Split into training and testing
# -----------------------------------
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Scale the features: fit on the training data only
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# -----------------------------------
# 4. Train the Support Vector Machine (SVM) model
# -----------------------------------
svm_model = SVC(
    kernel='rbf',        # nonlinear boundary
    C=1.0,               # regularization parameter
    gamma='scale',
    probability=True,    # allows predict_proba
    random_state=42
)
svm_model.fit(X_train_scaled, y_train)

# -----------------------------------
# 5. Make predictions on the test set
# -----------------------------------
y_pred = svm_model.predict(X_test_scaled)
y_prob = svm_model.predict_proba(X_test_scaled)[:, 1]

# -----------------------------------
# 6. Evaluate the model
# -----------------------------------
print("Accuracy:", accuracy_score(y_test, y_pred))
print("\nConfusion Matrix:\n", confusion_matrix(y_test, y_pred))
print("\nClassification Report:\n", classification_report(y_test, y_pred))

# -----------------------------------
# 7. Predict churn for a new customer
# -----------------------------------
# Tenure, MonthlyCharges, ContractType, SupportCalls
new_customer = pd.DataFrame([[8, 92, 0, 5]], columns=feature_cols)
new_customer_scaled = scaler.transform(new_customer)  # same scaling as training

churn_prediction = svm_model.predict(new_customer_scaled)
churn_probability = svm_model.predict_proba(new_customer_scaled)[0][1]

print("Churn Prediction:", churn_prediction[0])
print("Probability of Churn:", churn_probability)

# Interpreting the results (business view)
# 1 → High risk of churn ⚠️
# 0 → Likely to stay ✅
# Use probability (e.g., churn > 0.6) to trigger retention offers
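The rule of using the churn probability to trigger retention offers can be sketched as a small helper. The function name flag_for_retention and the example probabilities are hypothetical; the 0.6 cut-off is the example value mentioned above and would normally be tuned to the business.

```python
import numpy as np

# Example cut-off from the text above; tune it to your own business needs
RETENTION_THRESHOLD = 0.6

def flag_for_retention(churn_probabilities, threshold=RETENTION_THRESHOLD):
    """Return a boolean array that is True where the churn probability exceeds the threshold."""
    probs = np.asarray(churn_probabilities, dtype=float)
    return probs > threshold

# Hypothetical probabilities, e.g. the second column of predict_proba
example_probs = [0.15, 0.61, 0.60, 0.92]
flags = flag_for_retention(example_probs)

print(flags)
```

In the churn example, the input would be the y_prob array, so that each test customer gets a True or False retention flag.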

USE CASE 3: Learning style classification using a Support Vector Machine (SVM) with scikit-learn in Python. The code uses 4 independent variables: prefers_diagrams, prefers_lectures, prefers_notes and prefers_hands_on. The dependent variable is learning_style, a categorical label.

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report

# -----------------------------------
# 1. Load data from Excel
# -----------------------------------
# Sample data can be exported to Excel from the
# "Get the Categorical learning Styles data in Excel" link
df = pd.read_excel("Categorical_learning_Styles.xlsx")

print("Dataset Preview:")
print(df.head())

# -----------------------------------
# 2. Define the data
# -----------------------------------
feature_cols = ['prefers_diagrams', 'prefers_lectures', 'prefers_notes', 'prefers_hands_on']
X = df[feature_cols]
y = df['learning_style']

# Encode categorical target labels as integers
le = LabelEncoder()
y_encoded = le.fit_transform(y)

# -----------------------------
# 3. Train-Test Split
# -----------------------------
X_train, X_test, y_train, y_test = train_test_split(
    X, y_encoded, test_size=0.3, random_state=42
)

# Scale the features: fit on the training data only
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# -----------------------------------
# 4. Train the Support Vector Machine (SVM) model
# -----------------------------------
svm_model = SVC(
    kernel='rbf',        # nonlinear boundary
    C=1.0,               # regularization parameter
    gamma='scale',
    probability=True,    # allows predict_proba
    random_state=42
)
svm_model.fit(X_train_scaled, y_train)

# -----------------------------
# 5. Make Predictions
# -----------------------------
y_pred = svm_model.predict(X_test_scaled)

# -----------------------------
# 6. Evaluate Model
# -----------------------------
print("Accuracy:", accuracy_score(y_test, y_pred))
print("\nClassification Report:\n")
print(classification_report(y_test, y_pred, target_names=le.classes_))

# -----------------------------
# 7. Predict with sample data
# -----------------------------
new_students = pd.DataFrame(
    np.array([
        [5, 1, 2, 1],  # Likely Visual
        [1, 5, 3, 2],  # Likely Auditory
        [2, 1, 5, 2],  # Likely Reading/Writing
        [1, 2, 1, 5]   # Likely Kinesthetic
    ]),
    columns=feature_cols
)
new_students_scaled = scaler.transform(new_students)  # same scaling as training

# Predict encoded labels
predictions_encoded = svm_model.predict(new_students_scaled)

# Convert numeric predictions back to original labels
predictions = le.inverse_transform(predictions_encoded)

print("Predicted Learning Styles:")
print(predictions)
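The label encoding step applied to the learning_style target can be seen in isolation in the short sketch below. The label strings are assumed from the comments in the sample predictions and may not match the exact spelling used in the Excel file. LabelEncoder assigns integers in sorted order of the distinct labels, and inverse_transform maps them back.

```python
from sklearn.preprocessing import LabelEncoder

# Assumed label strings, for illustration only
labels = ["Visual", "Auditory", "Kinesthetic", "Visual"]

le = LabelEncoder()
encoded = le.fit_transform(labels)        # strings -> integers
decoded = le.inverse_transform(encoded)   # integers -> original strings

print("Classes:", le.classes_)
print("Encoded:", encoded)
print("Decoded:", decoded)
```

Keeping the fitted encoder around is what makes it possible to turn the model's numeric predictions back into readable labels.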

USE CASE 4: Disease classification using a Support Vector Machine (SVM) with scikit-learn in Python. The code uses 4 independent variables: Age, BloodPressure, Cholesterol and FamilyHistory. The dependent variable is Disease.

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report

# -----------------------------------
# 1. Load data from Excel
# -----------------------------------
# Sample data can be exported to Excel from the
# "Get Disease Classification in Excel" link
df = pd.read_excel("patient_dosage_response.xlsx")

print("Dataset Preview:")
print(df.head())

# ----------------------------------
# 2. Separate Features and Target
# ----------------------------------
feature_cols = ['Age', 'BloodPressure', 'Cholesterol', 'FamilyHistory']
X = df[feature_cols]
y = df['Disease']

# ----------------------------------
# 3. Train-Test Split
# ----------------------------------
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Scale the features: fit on the training data only
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# -----------------------------------
# 4. Train the Support Vector Machine (SVM) model
# -----------------------------------
svm_model = SVC(
    kernel='rbf',        # nonlinear boundary
    C=1.0,               # regularization parameter
    gamma='scale',
    probability=True,    # allows predict_proba
    random_state=42
)
svm_model.fit(X_train_scaled, y_train)

# ----------------------------------
# 5. Make Predictions
# ----------------------------------
y_pred = svm_model.predict(X_test_scaled)

# ----------------------------------
# 6. Evaluate Model
# ----------------------------------
print("Accuracy:", accuracy_score(y_test, y_pred))
print("\nClassification Report:\n")
print(classification_report(y_test, y_pred))

# ----------------------------------
# 7. Predict with New Sample Data (NumPy Array)
# ----------------------------------
# New patient data
# Format: [Age, BloodPressure, Cholesterol, FamilyHistory]
new_patients = pd.DataFrame(
    np.array([
        [45, 150, 230, 1],  # High risk
        [28, 118, 175, 0]   # Low risk
    ]),
    columns=feature_cols
)
new_patients_scaled = scaler.transform(new_patients)  # same scaling as training

predictions = svm_model.predict(new_patients_scaled)

print("Disease Predictions:")
print(predictions)
