Data-Science-Projects

A collection of data science projects to help me practice working with data across diverse domains.

Anomaly Detection for Financial Data (01/26/26)

A binary classification competition for anomaly detection in VDI process financial data. The dataset contained binary-encoded features from financial transactions, and the task was to classify each sample as anomalous or normal.

Systematically evaluated three modeling approaches of decreasing complexity:

PCA dimensionality reduction + Random Forest (F1: 22%)
Random Forest with hyperparameter tuning (F1: 57%)
Logistic Regression with default parameters (F1: 100%)
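
The comparison above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the code from the competition notebook: the synthetic dataset, the PCA component count, and the hyperparameter grid are all assumptions chosen to keep the example small, so the printed scores will not match the figures listed above.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline

# Synthetic, imbalanced stand-in for the competition data (assumption:
# the real binary-encoded dataset is not reproduced here).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=5,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The three approaches, ordered from most to least complex.
# Component count and grid values are illustrative guesses.
candidates = {
    "pca_random_forest": make_pipeline(
        PCA(n_components=5),
        RandomForestClassifier(n_estimators=50, random_state=0),
    ),
    "tuned_random_forest": GridSearchCV(
        RandomForestClassifier(n_estimators=50, random_state=0),
        param_grid={"max_depth": [3, None], "min_samples_leaf": [1, 5]},
        scoring="f1",
        cv=3,
    ),
    "logistic_regression": LogisticRegression(),  # default parameters
}


def evaluate(models):
    """Fit each model on the training split and return its test-set F1 score."""
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = f1_score(y_test, model.predict(X_test))
    return scores


scores = evaluate(candidates)
for name, score in scores.items():
    print(f"{name}: F1 = {score:.2f}")
```

Scoring every candidate on the same held-out split keeps the F1 values directly comparable, which is what makes a simple baseline a meaningful reference point.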

The simplest model achieved a perfect F1 score, which illustrates an important ML principle: always establish baseline performance with simple models before adding complexity. The result suggests that the binary-encoded data is linearly separable, which would explain why logistic regression outperformed the more sophisticated ensemble methods.
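
One way to probe the linear separability mentioned above is to train a perceptron, which is guaranteed to reach zero training errors when the data is linearly separable. The sketch below is a hypothetical illustration on toy binary inputs, not part of the original project: `perceptron_separates` and the OR/XOR examples are invented names and data. Note that failing to converge within `max_epochs` only suggests, and does not prove, that the data is not separable.

```python
def perceptron_separates(samples, labels, max_epochs=1000):
    """Return True if a perceptron reaches zero training errors within max_epochs."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for x, label in zip(samples, labels):
            target = 1 if label == 1 else -1  # map {0, 1} labels to {-1, +1}
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            if target * activation <= 0:  # misclassified or on the boundary
                weights = [w + target * xi for w, xi in zip(weights, x)]
                bias += target
                errors += 1
        if errors == 0:
            return True  # a separating hyperplane was found
    return False


# Toy binary-encoded inputs (hypothetical, for illustration only).
inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
or_labels = [0, 1, 1, 1]   # linearly separable
xor_labels = [0, 1, 1, 0]  # not linearly separable

print(perceptron_separates(inputs, or_labels))   # True
print(perceptron_separates(inputs, xor_labels))  # False
```

When a check like this succeeds on the training data, a linear model such as logistic regression is a natural first choice.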

This project reinforced the value of systematic model selection and the importance of matching model complexity to problem complexity.

Link to competition

jregio/Data-Science-Projects
