[pre-commit.ci] pre-commit autoupdate (#11322)

* [pre-commit.ci] pre-commit autoupdate

updates:
- [github.com/astral-sh/ruff-pre-commit: v0.2.2 → v0.3.2](https://github.com/astral-sh/ruff-pre-commit/compare/v0.2.2...v0.3.2)
- [github.com/pre-commit/mirrors-mypy: v1.8.0 → v1.9.0](https://github.com/pre-commit/mirrors-mypy/compare/v1.8.0...v1.9.0)

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci
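
For reference, an autoupdate like this only bumps the rev pins in .pre-commit-config.yaml. A minimal sketch of the affected entries (the hook ids shown are the usual ones published by these repos, not necessarily this repository's exact hook list):

    repos:
      - repo: https://github.com/astral-sh/ruff-pre-commit
        rev: v0.3.2  # was v0.2.2
        hooks:
          - id: ruff
      - repo: https://github.com/pre-commit/mirrors-mypy
        rev: v1.9.0  # was v1.8.0
        hooks:
          - id: mypy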

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Author: pre-commit-ci[bot]
Date: 2024-03-13 07:52:41 +01:00
Committed by: GitHub
Parent: 5f95d6f805
Commit: bc8df6de31
297 changed files with 488 additions and 285 deletions

@@ -10,6 +10,7 @@ indicating that customers who purchased A and B are more likely to also purchase
WIKI: https://en.wikipedia.org/wiki/Apriori_algorithm
Examples: https://www.kaggle.com/code/earthian/apriori-association-rules-mining
"""
from itertools import combinations
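
As a rough illustration of what the combinations import above is used for: Apriori enumerates candidate itemsets and keeps those whose support clears a threshold. A minimal sketch with invented basket data and an invented min_support value:

    from itertools import combinations

    # Hypothetical market-basket data, for illustration only.
    transactions = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B", "C"}]
    min_support = 0.5  # keep itemsets found in at least half of the baskets

    items = sorted({item for basket in transactions for item in basket})
    for size in (1, 2):
        for candidate in combinations(items, size):
            # Support = fraction of baskets containing every item in the candidate.
            support = sum(set(candidate) <= b for b in transactions) / len(transactions)
            if support >= min_support:
                print(candidate, support)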

@@ -12,6 +12,7 @@ reason, A* is known as an algorithm with brains.
https://en.wikipedia.org/wiki/A*_search_algorithm
"""
import numpy as np

@@ -6,6 +6,7 @@ Reference: https://en.wikipedia.org/wiki/Automatic_differentiation
Author: Poojan Smart
Email: smrtpoojan@gmail.com
"""
from __future__ import annotations
from collections import defaultdict

@@ -25,6 +25,7 @@ Additionally, a few rules of thumb are:
2. Non-Gaussian (non-normal) distributions work better with normalization
3. If a column or list of values has extreme values / outliers, use standardization
"""
from statistics import mean, stdev
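
To make the rules of thumb concrete, a minimal sketch of both rescalings using the same statistics helpers this module imports (the sample values are invented):

    from statistics import mean, stdev

    data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up sample

    # Normalization (min-max): squeeze values into [0, 1].
    low, high = min(data), max(data)
    normalized = [(x - low) / (high - low) for x in data]

    # Standardization (z-score): zero mean, unit standard deviation.
    mu, sigma = mean(data), stdev(data)
    standardized = [(x - mu) / sigma for x in data]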

@@ -3,6 +3,7 @@ Implementation of a basic regression decision tree.
Input data set: The input data set must be 1-dimensional with continuous labels.
Output: The decision tree maps a real number input to a real number output.
"""
import numpy as np
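
For intuition, a regression tree on 1-D data repeatedly chooses the threshold that minimizes the summed squared error when each side is predicted by its own mean. A minimal sketch of one split search (not this module's actual implementation):

    import numpy as np

    def best_split(x: np.ndarray, y: np.ndarray) -> float:
        """Return the threshold on x minimizing total squared error
        when each side is predicted by its mean label."""
        best_err, best_thr = float("inf"), float(x[0])
        for thr in x:
            left, right = y[x < thr], y[x >= thr]
            if len(left) == 0 or len(right) == 0:
                continue
            err = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
            if err < best_err:
                best_err, best_thr = err, float(thr)
        return best_thr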

@@ -9,6 +9,7 @@ WIKI: https://athena.ecs.csus.edu/~mei/associationcw/FpGrowth.html
Examples: https://www.javatpoint.com/fp-growth-algorithm-in-data-mining
"""
from __future__ import annotations
from dataclasses import dataclass, field

@@ -2,6 +2,7 @@
Implementation of the gradient descent algorithm for minimizing the cost of a
linear hypothesis function.
"""
import numpy
# List of input, output pairs
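
The update rule behind this file is theta_j := theta_j - alpha * dJ/dtheta_j for a mean-squared-error cost. A minimal self-contained sketch for the hypothesis h(x) = theta0 + theta1 * x (the data, learning rate, and iteration count are invented):

    import numpy

    # Invented 1-D data where y is roughly 2x + 1.
    x = numpy.array([0.0, 1.0, 2.0, 3.0])
    y = numpy.array([1.1, 2.9, 5.2, 6.8])
    theta0, theta1, alpha = 0.0, 0.0, 0.05

    for _ in range(2000):
        error = (theta0 + theta1 * x) - y
        # Gradients of the mean squared error cost w.r.t. each parameter.
        theta0 -= alpha * error.mean()
        theta1 -= alpha * (error * x).mean()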

@@ -40,6 +40,7 @@ Usage:
5. Transfers the DataFrame into Excel format; it must have a feature called
   'Clust' with the k-means cluster numbers in it.
"""
import warnings
import numpy as np

@@ -1,47 +1,48 @@
"""
Linear Discriminant Analysis

Assumptions About Data:
    1. The input variables have a Gaussian distribution.
    2. The variance calculated for each input variable, grouped by class, is
       the same.
    3. The mix of classes in your training set is representative of the problem.

Learning The Model:
    The LDA model requires the estimation of statistics from the training data:
        1. Mean of each input value for each class.
        2. Probability of an instance belonging to each class.
        3. Covariance of the input data for each class.

    Calculate the class means:
        mean(x) = 1/n (for i = 1 to i = n --> sum(xi))

    Calculate the class probabilities:
        P(y = 0) = count(y = 0) / (count(y = 0) + count(y = 1))
        P(y = 1) = count(y = 1) / (count(y = 0) + count(y = 1))

    Calculate the variance:
        We can calculate the variance for the dataset in two steps:
            1. Calculate the squared difference for each input variable from
               the group mean.
            2. Calculate the mean of the squared differences.
        ------------------------------------------------
        Squared_Difference = (x - mean(k)) ** 2
        Variance = (1 / (count(x) - count(classes))) *
                   (for i = 1 to i = n --> sum(Squared_Difference(xi)))

Making Predictions:
    discriminant(x) = x * (mean / variance) -
                      ((mean ** 2) / (2 * variance)) + Ln(probability)
    ---------------------------------------------------------------------------
    After calculating the discriminant value for each class, the class with the
    largest discriminant value is taken as the prediction.

Author: @EverLookNeverSee
"""
from collections.abc import Callable
from math import log
from os import name, system
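
Putting the formulas above together, a minimal two-class sketch with invented 1-D training data (the module's real implementation is structured differently):

    from math import log

    def discriminant(x: float, mean: float, variance: float, prob: float) -> float:
        # x * (mean / variance) - mean**2 / (2 * variance) + Ln(probability)
        return x * (mean / variance) - mean**2 / (2 * variance) + log(prob)

    class0 = [1.0, 1.2, 0.8, 1.1]  # made-up samples
    class1 = [3.0, 3.3, 2.7, 3.1]

    mean0 = sum(class0) / len(class0)
    mean1 = sum(class1) / len(class1)
    p0 = len(class0) / (len(class0) + len(class1))
    p1 = 1 - p0

    # Pooled variance: summed squared differences from each group mean,
    # divided by count(x) - count(classes).
    sq_diff = sum((x - mean0) ** 2 for x in class0) + sum((x - mean1) ** 2 for x in class1)
    variance = sq_diff / (len(class0) + len(class1) - 2)

    x_new = 2.5
    # The class with the larger discriminant value is the prediction.
    prediction = int(discriminant(x_new, mean1, variance, p1) > discriminant(x_new, mean0, variance, p0))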

@@ -7,6 +7,7 @@ We try to set the weight of these features, over many iterations, so that they best
fit our dataset. In this particular code, I have used a CSGO dataset (ADR vs
Rating). We try to fit a line through the dataset and estimate the parameters.
"""
import numpy as np
import requests
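
For the parameter-estimation step described above, a minimal least-squares sketch with numpy (the sample points are invented stand-ins for the ADR vs Rating data):

    import numpy as np

    # Invented (ADR, Rating) pairs for illustration.
    x = np.array([60.0, 70.0, 80.0, 90.0])
    y = np.array([0.90, 1.00, 1.15, 1.30])

    # Solve for [intercept, slope] minimizing ||a @ theta - y||.
    a = np.column_stack([np.ones_like(x), x])
    theta, *_ = np.linalg.lstsq(a, y, rcond=None)
    intercept, slope = theta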

@@ -14,6 +14,7 @@ Helpful resources:
Coursera ML course
https://medium.com/@martinpella/logistic-regression-from-scratch-in-python-124c5636b8ac
"""
import numpy as np
from matplotlib import pyplot as plt
from sklearn import datasets

@@ -1,9 +1,10 @@
"""
Create a Long Short Term Memory (LSTM) network model
An LSTM is a type of Recurrent Neural Network (RNN) as discussed at:
* https://colah.github.io/posts/2015-08-Understanding-LSTMs
* https://en.wikipedia.org/wiki/Long_short-term_memory
"""
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

@@ -57,7 +57,6 @@ References:
Author: Amir Lavasani
"""
import logging
import numpy as np

@@ -1,6 +1,7 @@
"""
https://en.wikipedia.org/wiki/Self-organizing_map
"""
import math

@@ -30,7 +30,6 @@ Reference:
https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/tr-98-14.pdf
"""
import os
import sys
import urllib.request

@@ -7,6 +7,7 @@ returns a list containing two items for each vector:
1. the nearest vector
2. the distance between the vector and the nearest vector (float)
"""
from __future__ import annotations
import math
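
A minimal sketch of the interface described above, pairing each query vector with its nearest neighbour and the Euclidean distance (the sample vectors are invented):

    import math

    def nearest(vectors: list[list[float]], queries: list[list[float]]) -> list[tuple[list[float], float]]:
        """For each query, return (nearest vector, distance to it)."""
        out = []
        for q in queries:
            best = min(vectors, key=lambda v: math.dist(q, v))
            out.append((best, math.dist(q, best)))
        return out

    print(nearest([[0.0, 0.0], [3.0, 4.0]], [[1.0, 1.0]]))  # [([0.0, 0.0], 1.4142...)]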