first commit

This commit is contained in:
Samir Saci
2021-11-01 23:12:52 +01:00
commit 4bd027b0d7
40 changed files with 6016 additions and 0 deletions

4
.gitignore vendored Executable file

@@ -0,0 +1,4 @@
.dist
venv/*
App/*
.notes.txt

BIN
1000lines_35m_3mpng.png Normal file

Binary file not shown (PNG image, 39 KiB).

5001
In/df_lines.csv Normal file

File diff suppressed because it is too large.

21
LICENSE Normal file

@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2021 Samir Saci
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

174
README.md Normal file

@@ -0,0 +1,174 @@
# Automate ABC Analysis & Product Segmentation with Streamlit 📈
*A statistical methodology to segment your products based on turnover and demand variability, automated with a web application built with the Streamlit framework*
<p align="center">
<img align="center" src="images/streamlit_capture.PNG" width=75%>
</p>
Product segmentation refers to the activity of grouping products that have similar characteristics and serve a similar market. It is usually related to marketing _(Sales Categories)_ or manufacturing _(Production Processes)_. However, as a **Supply Chain Engineer**, your focus is not on the product itself but on the complexity of managing its flow.
You want to understand the distribution of sales volumes (fast/slow movers) and the demand variability, so you can optimize your production, storage and delivery operations and ensure the best service level, considering:
- The highest contribution to your total volume: ABC Analysis
- The most unstable demand: Demand Variability
I have designed this **Streamlit App** to give **Supply Chain Engineers** a tool to segment the products of their portfolio, with a focus on retail, considering the demand variability and the volume contribution of each item.
### Understand the theory behind 📜
In this [Medium Article](https://towardsdatascience.com/product-segmentation-for-retail-with-python-c85cc0930f9a), you can find details about the theory used to build this tool.
# Access the application 🖥️
> Access it here: [Product Segmentation for Retail](https://share.streamlit.io/samirsaci/segmentation/main/segmentation.py)
## **Step 0: Why should you use it?**
This Streamlit Web Application has been designed for Supply Chain Engineers to support them in their Inventory Management. It will help you to automate product segmentation using statistics.
## **Step 1: What do you want to do?**
You have two ways to use this application:
- 🖥️ Look at the results computed by the model using the pre-loaded dataset: in that case you just need to scroll to see the visuals and the analyses
OR
- 💾 Upload your dataset of sales records that includes columns related to:
- **Item master data**
_For example: SKU ID, Category, Sub-Category, Store ID_
- **Date of the sales**:
_For example: Day, Week, Month, Year_
- **Quantity or value**: this measure will be used for the ABC analysis
_For example: units, cartons, pallets or euros/dollars/your local currency_
## **Step 2: Prepare the analysis**
### **1. 💾 Upload your dataset of sales records**
<p align="center">
<img align="center" src="images/step_1.PNG" width=40%>
</p>
💡 _Please make sure that your dataset is a CSV file smaller than 200 MB. If you need a higher limit, copy this repository and deploy the app locally following the instructions below._
### **2. 📅 [Parameters] select the columns for the date (day, week, year) and the values (quantity, $)**
<p align="center">
<img align="center" src="images/step_2.PNG" width=75%>
</p>
💡 _If you have several columns for the date (day, week, month) and for the values (quantity, amount), you can select only one column per category for each calculation run._
### **3. 📉 [Parameters] select all the columns you want to keep in the analysis**
<p align="center">
<img align="center" src="images/step_3.PNG" width=75%>
</p>
💡 _This step removes the columns you do not need for the analysis, which speeds up computation and reduces resource usage._
### **4. 🏬 [Parameters] select all the columns related to product master data (SKU ID, FAMILY, CATEGORY, STORE LOCATION)**
<p align="center">
<img align="center" src="images/step_4.PNG" width=75%>
</p>
💡 _In this step you set the granularity of your analysis (as illustrated in the sketch below). For example, it can be at:_
- _Item, Store level: the same item in two stores is counted as two SKUs_
- _Item ID level: the sales of each item are grouped across all stores_
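💡 _As an illustration, here is how the two granularities differ on a toy sales table (hypothetical column names SKU, STORE, QTY):_

```python
import pandas as pd

# The same item sold in two stores
df = pd.DataFrame({'SKU': ['X', 'X'], 'STORE': ['S1', 'S2'], 'QTY': [10, 20]})
print(df.groupby(['SKU', 'STORE'])['QTY'].sum())  # two SKUs (Item, Store level)
print(df.groupby(['SKU'])['QTY'].sum())           # one SKU (Item ID level)
```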
### **5. 🛍️ [Parameters] select one feature you want to use for analysis by family**
<p align="center">
<img align="center" src="images/step_5.PNG" width=75%>
</p>
💡 _This feature will be used to plot the distribution of (A, B, C) products by family_
### **6. 🖱️ Click on Start Calculation? to launch the analysis**
<p align="center">
<img align="center" src="images/step_6.PNG" width=75%>
</p>
💡 _Tick the Start Calculation? box to launch the computation with the parameters selected above_
# Get insights about your sales records 💡
### **Pareto Analysis**
<p align="center">
<img align="center" src="images/pareto.PNG" width=75%>
</p>
**INSIGHTS:**
1. How many SKUs represent 80% of your total sales?
2. What share of your total sales do your top 20% SKUs represent?
_For more information about the theory behind the Pareto law and its application in Supply Chain Management: [Pareto Principle for Warehouse Layout Optimization](https://towardsdatascience.com/reduce-warehouse-space-with-the-pareto-principle-using-python-e722a6babe0e)_
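_As an illustration, here is a minimal sketch of the underlying Pareto computation (hypothetical column names SKU and QTY; the application computes this for you):_

```python
import pandas as pd

df_sales = pd.DataFrame({'SKU': ['A1', 'A2', 'A3', 'A4'],
                         'QTY': [500, 300, 150, 50]})
df = df_sales.sort_values('QTY', ascending=False).reset_index(drop=True)
df['QTY_CUM_%'] = 100 * df['QTY'].cumsum() / df['QTY'].sum()
n_sku_80 = (df['QTY_CUM_%'] <= 80).sum()
print('{} SKUs out of {} represent 80% of the sales'.format(n_sku_80, len(df)))
```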
### **ABC Analysis with Demand Variability**
<p align="center">
<img align="center" src="images/abc_analysis.PNG" width=75%>
</p>
**QUESTIONS: WHAT IS THE PROPORTION OF?**
1. **LOW IMPORTANCE SKUS**: C references
2. **STABLE DEMAND SKUS**: A and B SKUs with a coefficient of variation below 1
3. **HIGH IMPORTANCE SKUS**: A and B SKUS with a high coefficient of variation
Your inventory management strategies will be impacted by this split:
- Minimum effort should be put into **LOW IMPORTANCE SKUs**
- Automated rules with moderate attention for **STABLE DEMAND SKUs**
- Complex replenishment rules and careful attention for **HIGH IMPORTANCE SKUs**
_For more information: [Medium Article](https://towardsdatascience.com/product-segmentation-for-retail-with-python-c85cc0930f9a)_
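_A minimal sketch of how these two dimensions can be computed, assuming weekly sales with hypothetical columns SKU and QTY (ABC cut-offs at 80% and 95% of the cumulated sales, coefficient of variation CV = σ/μ):_

```python
import pandas as pd

df_weekly = pd.DataFrame({'SKU': ['X', 'X', 'Y', 'Y', 'Z', 'Z'],
                          'QTY': [100, 110, 40, 5, 10, 12]})
g = df_weekly.groupby('SKU')['QTY']
df = pd.DataFrame({'TOTAL': g.sum(), 'CV': g.std() / g.mean()})
df = df.sort_values('TOTAL', ascending=False)
share = df['TOTAL'].cumsum() / df['TOTAL'].sum()
df['ABC'] = pd.cut(share, bins=[0, 0.8, 0.95, 1], labels=['A', 'B', 'C'])
print(df)
```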
<p align="center">
<img align="center" src="images/split_category.PNG" width=75%>
</p>
**QUESTIONS:**
1. What is the split of SKUS by FAMILY?
2. What is the split of SKUS by ABC class in each FAMILY?
### **Normality Test**
<p align="center">
<img align="center" src="images/normality.PNG" width=75%>
</p>
**QUESTION:**
- Which SKUs have a sales distribution that follows a normal distribution?
Many inventory rules and safety stock formulas apply only if the sales distribution of your item follows a normal distribution. Therefore, it is better to know the percentage of your portfolio that can be managed easily.
_For more information: [Inventory Management for Retail — Stochastic Demand](https://towardsdatascience.com/inventory-management-for-retail-stochastic-demand-3020a43d1c14)_
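_A minimal sketch of such a test with the Shapiro-Wilk test from SciPy (illustrative 5% significance level):_

```python
import numpy as np
from scipy.stats import shapiro

np.random.seed(42)
sales = np.random.normal(100, 15, 52)  # one year of weekly sales for one SKU
stat, p = shapiro(sales)
print('normal' if p > 0.05 else 'not normal')
```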
# Build the application locally 🏗️
## **Build a local Python environment (recommended)**
### Install **virtualenv** using pip3

    sudo pip3 install virtualenv

### Create a virtual environment

    virtualenv venv

### Activate your virtual environment

    source venv/bin/activate

## Launch Streamlit 🚀
### Install all the dependencies needed using requirements.txt

    pip install -r requirements.txt

### Run the application

    streamlit run segmentation.py
### Click on the Network URL in the shell
<p align="center">
<img align="center" src="images/network.PNG" width=50%>
</p>
> Enjoy!
# About me 🤓
Senior Supply Chain Engineer with international experience in Logistics and Transportation operations. \
Have a look at my portfolio: [Data Science for Supply Chain Portfolio](https://samirsaci.com) \
Data Science for Warehousing📦, Transportation 🚚 and Demand Forecasting 📈

124
app.py Executable file

@@ -0,0 +1,124 @@
import pandas as pd
import numpy as np
import plotly.express as px
from utils.routing.distances import (
distance_picking,
next_location
)
from utils.routing.routes import (
create_picking_route
)
from utils.batch.mapping_batch import (
orderlines_mapping,
locations_listing
)
from utils.cluster.mapping_cluster import (
df_mapping
)
from utils.batch.simulation_batch import (
simulation_wave,
simulate_batch
)
from utils.cluster.simulation_cluster import(
loop_wave,
simulation_cluster,
create_dataframe,
process_methods
)
from utils.results.plot import (
plot_simulation1,
plot_simulation2
)
import streamlit as st
# Set page configuration
st.set_page_config(page_title="Improve Warehouse Productivity using Order Batching",
                   initial_sidebar_state="expanded",
                   layout='wide',
                   page_icon="🛒")
# Cached data loading
@st.cache(persist=False,
          allow_output_mutation=True,
          suppress_st_warning=True,
          show_spinner=True)
def load(filename, n):
    '''Load the first n order lines of the dataset'''
    df_orderlines = pd.read_csv(IN + filename).head(n)
    return df_orderlines
# Alley Coordinates on y-axis
y_low, y_high = 5.5, 50
# Origin Location
origin_loc = [0, y_low]
# Distance Threshold (m)
distance_threshold = 35
distance_list = [1] + [i for i in range(5, 100, 5)]
IN = 'In/'
# Store Results by WaveID
list_wid, list_dst, list_route, list_ord, list_lines, list_pcs, list_monomult = [], [], [], [], [], [], []
list_results = [list_wid, list_dst, list_route, list_ord, list_lines, list_pcs, list_monomult] # Group in list
# Store Results by Simulation (Order_number)
list_ordnum , list_dstw = [], []
# Simulation 1: Order Batch
# SCOPE SIZE
st.header("**🥇 Impact of the wave size in orders (Orders/Wave)**")
st.subheader('''
🛠️ HOW MANY ORDER LINES DO YOU WANT TO INCLUDE IN YOUR ANALYSIS?
''')
col1, col2 = st.beta_columns(2)
with col1:
    n = st.slider(
        'SIMULATION 1 SCOPE (THOUSAND ORDERS)', 1, 200, value=5)
with col2:
    lines_number = 1000 * n
    st.write('''🛠️ {:,} order lines'''.format(lines_number))
# SIMULATION PARAMETERS
st.subheader('''
🛠️ SIMULATE ORDER PICKING BY WAVE OF N ORDERS PER WAVE WITH N IN [N_MIN, N_MAX] ''')
col_11, col_22 = st.beta_columns(2)
with col_11:
    n1 = st.slider(
        'SIMULATION 1: N_MIN (ORDERS/WAVE)', 0, 20, value=1)
    n2 = st.slider(
        'SIMULATION 1: N_MAX (ORDERS/WAVE)', n1 + 1, 20, value=int(np.max([n1 + 1, 10])))
with col_22:
    st.write('''[N_MIN, N_MAX] = [{:,}, {:,}]'''.format(n1, n2))
# START CALCULATION
start_1 = False
if st.checkbox('SIMULATION 1: START CALCULATION', key='show', value=False):
    start_1 = True
# Calculation
if start_1:
    df_orderlines = load('df_lines.csv', lines_number)
    df_waves, df_results = simulate_batch(n1, n2, y_low, y_high, origin_loc, lines_number, df_orderlines)
    plot_simulation1(df_results, lines_number)
# Simulation 2: Order Batch using Spatial Clustering
# SCOPE SIZE
st.header("**🥈 Impact of the wave size using spatial clustering (Orders/Wave)**")
st.subheader('''
🛠️ HOW MANY ORDER LINES DO YOU WANT TO INCLUDE IN YOUR ANALYSIS?
''')
col1, col2 = st.beta_columns(2)
with col1:
    n_ = st.slider(
        'SIMULATION 2 SCOPE (THOUSAND ORDERS)', 1, 200, value=5)
with col2:
    lines_2 = 1000 * n_
    st.write('''🛠️ {:,} order lines'''.format(lines_2))
# START CALCULATION
start_2 = False
if st.checkbox('SIMULATION 2: START CALCULATION', key='show_2', value=False):
    start_2 = True
# Calculation
if start_2:
    df_orderlines = load('df_lines.csv', lines_2)
    df_reswave, df_results = simulation_cluster(y_low, y_high, df_orderlines, list_results, n1, n2,
                                                distance_threshold)
    plot_simulation2(df_reswave, lines_2, distance_threshold)

41
notes.txt Normal file

@@ -0,0 +1,41 @@
# Example Artefact
https://github.com/MaximeLutel/streamlit_prophet
https://streamlit.io/gallery?category=model-building-training
- The Math of the Prophet
https://medium.com/future-vision/the-math-of-prophet-46864fa9c55a
# INSTALL NODE
https://docs.microsoft.com/fr-fr/windows/dev-environment/javascript/nodejs-on-wsl
# Ubuntu WSL VS Code
https://code.visualstudio.com/docs/remote/wsl
- Grant admin rights to write and to download libraries
sudo chown -R samirs streamlit_prophet
# Move local directory of Windows to Local Linux
mkdir app
cp -R /mnt/c/Data/62-\ Projects/24-\ Articles/25-\ Improve\ Warehouse\ Productivity/App ~/app
cd ~/App
code .
# Github
git config --global user.email "samir.saci@outlook.com"
git config --global user.name "Samir Saci"
git remote add origin 'https://github.com/samirsaci/segmentation.git'
git push -u origin main
# Install virtualenv
pip install virtualenv
python3.8 -m virtualenv venv
source venv/bin/activate
# Activate Streamlit
streamlit run segmentation.py --server.address 0.0.0.0
streamlit run app.py --server.address 0.0.0.0
# SEGMENTATION TO DO
1) FAMILY = F(SKU SCOPE)
2) ITEM = ITEM LIST - FAMILY
C:\Data\62- Projects\24- Articles\25- Improve Warehouse Productivity\App

93
requirements.txt Normal file

@@ -0,0 +1,93 @@
absl-py==0.15.0
altair==4.1.0
argon2-cffi==21.1.0
astor==0.8.1
attrs==21.2.0
backcall==0.2.0
backports.zoneinfo==0.2.1
base58==2.1.1
bleach==4.1.0
blinker==1.4
cachetools==4.2.4
certifi==2021.10.8
cffi==1.15.0
charset-normalizer==2.0.7
click==8.0.3
cycler==0.11.0
debugpy==1.5.1
decorator==5.1.0
defusedxml==0.7.1
entrypoints==0.3
et-xmlfile==1.1.0
gitdb==4.0.9
GitPython==3.1.24
idna==3.3
ipykernel==6.4.2
ipython==7.29.0
ipython-genutils==0.2.0
ipywidgets==7.6.5
jedi==0.18.0
Jinja2==3.0.2
jsonschema==4.1.2
jupyter-client==7.0.6
jupyter-core==4.9.1
jupyterlab-pygments==0.1.2
jupyterlab-widgets==1.0.2
kiwisolver==1.3.2
MarkupSafe==2.0.1
matplotlib==3.4.3
matplotlib-inline==0.1.3
mistune==0.8.4
nbclient==0.5.4
nbconvert==6.2.0
nbformat==5.1.3
nest-asyncio==1.5.1
notebook==6.4.5
numpy==1.21.3
openpyxl==3.0.9
ortools==9.1.9490
packaging==21.2
pandas==1.3.4
pandocfilters==1.5.0
parso==0.8.2
pexpect==4.8.0
pickleshare==0.7.5
Pillow==8.4.0
plotly==5.3.1
prometheus-client==0.12.0
prompt-toolkit==3.0.21
protobuf==3.19.1
ptyprocess==0.7.0
pyarrow==6.0.0
pycparser==2.20
pydeck==0.7.1
Pygments==2.10.0
pyparsing==2.4.7
pyrsistent==0.18.0
python-dateutil==2.8.2
pytz==2021.3
pytz-deprecation-shim==0.1.0.post0
PyYAML==6.0
pyzmq==22.3.0
requests==2.26.0
scipy==1.7.1
Send2Trash==1.8.0
six==1.16.0
smmap==5.0.0
streamlit==0.77.0
tenacity==8.0.1
terminado==0.12.1
testpath==0.5.0
toml==0.10.2
toolz==0.11.1
tornado==6.1
traitlets==5.1.1
typing-extensions==3.10.0.2
tzdata==2021.5
tzlocal==4.1
urllib3==1.26.7
validators==0.18.2
watchdog==2.1.6
wcwidth==0.2.5
webencodings==0.5.1
widgetsnbextension==3.5.2

15 binary files not shown (including two PNG images, 33 KiB and 39 KiB).

30
utils/batch/mapping_batch.py Normal file

@@ -0,0 +1,30 @@
import pandas as pd
import itertools
from ast import literal_eval

def orderlines_mapping(df_orderlines, orders_number):
    '''Map each order line with a wave number'''
    df_orderlines.sort_values(by='DATE', ascending=True, inplace=True)
    # Unique order numbers list
    list_orders = df_orderlines.OrderNumber.unique()
    # range(1, len + 1) so the last order is also mapped
    dict_map = dict(zip(list_orders, range(1, len(list_orders) + 1)))
    # Order ID mapping
    df_orderlines['OrderID'] = df_orderlines['OrderNumber'].map(dict_map)
    # Grouping orders by waves of orders_number
    df_orderlines['WaveID'] = (df_orderlines.OrderID % orders_number == 0).shift(1).fillna(0).cumsum()
    # Counting the number of waves
    waves_number = df_orderlines.WaveID.max() + 1
    return df_orderlines, waves_number

def locations_listing(df_orderlines, wave_id):
    '''Get the storage locations to cover for a wave of orders'''
    df = df_orderlines[df_orderlines.WaveID == wave_id]
    # Create coordinates listing
    list_locs = list(df['Coord'].apply(lambda t: literal_eval(t)).values)
    list_locs.sort()
    # List of unique coordinates
    list_locs = list(k for k, _ in itertools.groupby(list_locs))
    n_locs = len(list_locs)
    return list_locs, n_locs
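
if __name__ == '__main__':
    # Illustration of the WaveID trick above with orders_number = 3 (toy data):
    # a new wave starts on the line after each OrderID divisible by 3
    order_id = pd.Series([1, 2, 3, 4, 5, 6, 7])
    print((order_id % 3 == 0).shift(1).fillna(0).cumsum().tolist())  # [0, 0, 0, 1, 1, 1, 2]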

41
utils/batch/simulation_batch.py Normal file

@@ -0,0 +1,41 @@
import pandas as pd
from utils.batch.mapping_batch import *
from utils.cluster.mapping_cluster import *
from utils.routing.routes import *

def simulation_wave(y_low, y_high, origin_loc, orders_number, df_orderlines, list_wid, list_dst, list_route, list_ord):
    '''Simulate the total picking distance with n orders per wave'''
    distance_route = 0
    # Create waves
    df_orderlines, waves_number = orderlines_mapping(df_orderlines, orders_number)
    for wave_id in range(waves_number):
        # Listing of all locations for this wave
        # (locations_listing resolves to the 4-value version of utils.cluster.clustering via the wildcard imports)
        list_locs, n_locs, n_lines, n_pcs = locations_listing(df_orderlines, wave_id)
        # Results
        wave_distance, list_chemin = create_picking_route(origin_loc, list_locs, y_low, y_high)
        distance_route = distance_route + wave_distance
        list_wid.append(wave_id)
        list_dst.append(wave_distance)
        list_route.append(list_chemin)
        list_ord.append(orders_number)
    return list_wid, list_dst, list_route, list_ord, distance_route

def simulate_batch(n1, n2, y_low, y_high, origin_loc, orders_number, df_orderlines):
    '''Loop over several scenarios of n orders per wave'''
    # Lists for results
    list_wid, list_dst, list_route, list_ord = [], [], [], []
    # Test several values of orders per wave
    for orders_number in range(n1, n2 + 1):
        list_wid, list_dst, list_route, list_ord, distance_route = simulation_wave(y_low, y_high, origin_loc, orders_number,
                                                                                   df_orderlines, list_wid, list_dst, list_route, list_ord)
        print("Total distance covered for {} orders/wave: {:,} m".format(orders_number, distance_route))
    # Results by wave
    df_waves = pd.DataFrame({'wave': list_wid,
                             'distance': list_dst,
                             'routes': list_route,
                             'order_per_wave': list_ord})
    # Aggregated results
    df_results = pd.DataFrame(df_waves.groupby(['order_per_wave'])['distance'].sum())
    df_results.columns = ['distance']
    return df_waves, df_results.reset_index()
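
if __name__ == '__main__':
    # Quick illustrative run on toy order lines (hypothetical data; columns
    # follow what the pipeline expects: DATE, OrderNumber, SKU, Coord, PCS)
    df = pd.DataFrame({'DATE': ['2021-01-01'] * 4,
                       'OrderNumber': [1, 1, 2, 3],
                       'SKU': ['A', 'B', 'A', 'C'],
                       'Coord': ['[0, 10]', '[2, 20]', '[4, 30]', '[6, 15]'],
                       'PCS': [1, 2, 1, 3]})
    # Waves of 1 and 2 orders, alleys between y = 5.5 and y = 50, origin at [0, 5.5]
    df_waves, df_results = simulate_batch(1, 2, 5.5, 50, [0, 5.5], len(df), df)
    print(df_results)  # total walking distance per wave size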

Binary file not shown.

98
utils/cluster/clustering.py Normal file

@@ -0,0 +1,98 @@
import numpy as np
import pandas as pd
import itertools
from ast import literal_eval
import matplotlib.pyplot as plt
from scipy.cluster.vq import kmeans2, whiten
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import ward, fcluster
from utils.routing.distances import *
def cluster_locations(list_coord, distance_threshold, dist_method, clust_start):
    '''Step 1: Create clusters of locations'''
    # Create the linkage matrix
    if dist_method == 'euclidian':
        Z = ward(pdist(np.stack(list_coord)))
    else:
        Z = ward(pdist(np.stack(list_coord), metric=distance_picking_cluster))
    # Flat cluster array
    fclust1 = fcluster(Z, t=distance_threshold, criterion='distance')
    return fclust1

def clustering_mapping(df, distance_threshold, dist_method, orders_number, wave_start, clust_start, df_type):
    '''Step 2: Clustering and mapping'''
    # 1. Create clusters
    list_coord, list_OrderNumber, clust_id, df = cluster_wave(df, distance_threshold, dist_method, clust_start, df_type)
    clust_idmax = max(clust_id)  # Last cluster ID
    # 2. Map order lines
    dict_map, dict_omap, df, Wave_max = lines_mapping_clst(df, list_coord, list_OrderNumber, clust_id, orders_number, wave_start)
    return dict_map, dict_omap, df, Wave_max, clust_idmax

def cluster_wave(df, distance_threshold, dist_method, clust_start, df_type):
    '''Step 3: Create waves by clusters'''
    # Create the column used for clustering
    if df_type == 'df_mono':
        df['Coord_Cluster'] = df['Coord']
    # Mapping points
    df_map = pd.DataFrame(df.groupby(['OrderNumber', 'Coord_Cluster'])['SKU'].count()).reset_index()  # Here we use Coord_Cluster
    list_coord, list_OrderNumber = np.stack(df_map.Coord_Cluster.apply(lambda t: literal_eval(t)).values), df_map.OrderNumber.values
    # Cluster picking locations
    clust_id = cluster_locations(list_coord, distance_threshold, dist_method, clust_start)
    clust_id = [(i + clust_start) for i in clust_id]
    # List of coordinates
    list_coord = np.stack(list_coord)
    return list_coord, list_OrderNumber, clust_id, df

def lines_mapping(df, orders_number, wave_start):
    '''Step 4: Map order lines without clustering'''
    # Unique order numbers list
    list_orders = df.OrderNumber.unique()
    # Dictionary for mapping (range(1, len + 1) so the last order is also mapped)
    dict_map = dict(zip(list_orders, range(1, len(list_orders) + 1)))
    # Order ID mapping
    df['OrderID'] = df['OrderNumber'].map(dict_map)
    # Grouping orders by waves of orders_number
    df['WaveID'] = (df.OrderID % orders_number == 0).shift(1).fillna(0).cumsum() + wave_start
    # Counting the number of waves
    waves_number = df.WaveID.max() + 1
    return df, waves_number

def lines_mapping_clst(df, list_coord, list_OrderNumber, clust_id, orders_number, wave_start):
    '''Step 4: Map order lines with clustering'''
    # Dictionary for mapping by cluster
    dict_map = dict(zip(list_OrderNumber, clust_id))
    # Dataframe mapping
    df['ClusterID'] = df['OrderNumber'].map(dict_map)
    # Order by cluster ID and order number
    df = df.sort_values(['ClusterID', 'OrderNumber'], ascending=True)
    list_orders = list(df.OrderNumber.unique())
    # Dictionary for order mapping
    dict_omap = dict(zip(list_orders, range(1, len(list_orders) + 1)))
    # Order ID mapping
    df['OrderID'] = df['OrderNumber'].map(dict_omap)
    # Create waves: increment when reaching orders_number or changing cluster
    df['WaveID'] = wave_start + ((df.OrderID % orders_number == 0) | (df.ClusterID.diff() != 0)).shift(1).fillna(0).cumsum()
    wave_max = df.WaveID.max()
    return dict_map, dict_omap, df, wave_max

def locations_listing(df_orderlines, wave_id):
    '''Step 5: List the locations to visit for a wave of orders'''
    # Filter by wave_id
    df = df_orderlines[df_orderlines.WaveID == wave_id]
    # Create coordinates listing
    list_coord = list(df['Coord'].apply(lambda t: literal_eval(t)).values)  # Here we use Coord for the distance
    list_coord.sort()
    # Keep unique coordinates
    list_coord = list(k for k, _ in itertools.groupby(list_coord))
    n_locs = len(list_coord)
    n_lines = len(df)
    n_pcs = df.PCS.sum()
    return list_coord, n_locs, n_lines, n_pcs
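
if __name__ == '__main__':
    # Toy check of the clustering step: four picking locations, Ward linkage,
    # 35 m distance threshold (same pattern as cluster_locations above)
    coords = np.array([[0, 6], [1, 6], [20, 40], [21, 41]])
    print(fcluster(ward(pdist(coords)), t=35, criterion='distance'))  # two clusters, e.g. [1 1 2 2]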

37
utils/cluster/mapping_cluster.py Normal file

@@ -0,0 +1,37 @@
from utils.cluster.clustering import *
from utils.process.processing import *
from utils.routing.distances import *
def df_mapping(df_orderlines, orders_number, distance_threshold, mono_method, multi_method):
    '''Map the order lines dataframe with wave numbers, using clustering where requested'''
    # Filter mono and multi line orders
    df_mono, df_multi = process_lines(df_orderlines)
    wave_start = 0
    clust_start = 0
    # Mapping for single line orders
    if mono_method == 'clustering':
        df_type = 'df_mono'
        dict_map, dict_omap, df_mono, waves_number, clust_idmax = clustering_mapping(df_mono, distance_threshold, 'custom',
                                                                                     orders_number, wave_start, clust_start, df_type)
    else:
        df_mono, waves_number = lines_mapping(df_mono, orders_number, 0)
        clust_idmax = 0
    # Start values for the multi line step
    wave_start = waves_number
    clust_start = clust_idmax
    # Mapping for multi line orders
    if multi_method == 'clustering':
        df_type = 'df_multi'
        df_multi = centroid_mapping(df_multi)
        dict_map, dict_omap, df_multi, waves_number, clust_idmax = clustering_mapping(df_multi, distance_threshold, 'custom',
                                                                                      orders_number, wave_start, clust_start, df_type)
    else:
        df_multi, waves_number = lines_mapping(df_multi, orders_number, wave_start)
    # Final concatenation
    df_orderlines, waves_number = monomult_concat(df_mono, df_multi)
    return df_orderlines, waves_number

150
utils/cluster/simulation_cluster.py Normal file

@@ -0,0 +1,150 @@
import pandas as pd
import matplotlib.pyplot as plt
from utils.cluster.mapping_cluster import *
from utils.routing.routes import *

def simulation_wave(y_low, y_high, orders_number, df_orderlines, list_results, distance_threshold, mono_method, multi_method):
    '''Simulate the picking distance for a number of orders per wave'''
    # Lists to store the values
    [list_wid, list_dst, list_route, list_ord, list_lines, list_pcs, list_monomult] = [list_results[i] for i in range(len(list_results))]
    # Variable to store the total distance
    distance_route = 0
    origin_loc = [0, y_low]
    # Mapping of order lines with wave numbers
    df_orderlines, waves_number = df_mapping(df_orderlines, orders_number, distance_threshold, mono_method, multi_method)
    # Loop on the waves
    for wave_id in range(waves_number):
        # Listing of all locations for this wave
        list_locs, n_locs, n_lines, n_pcs = locations_listing(df_orderlines, wave_id)
        # Create the picking route
        wave_distance, list_chemin, distance_max = create_picking_route_cluster(origin_loc, list_locs, y_low, y_high)
        # Total walking distance
        distance_route = distance_route + wave_distance
        # Method used for this scenario
        monomult = mono_method + '-' + multi_method
        # Add the results
        list_wid, list_dst, list_route, list_ord, list_lines, list_pcs, list_monomult = append_results(list_wid, list_dst, list_route,
            list_ord, list_lines, list_pcs, list_monomult, wave_id, wave_distance, list_chemin, orders_number, n_lines, n_pcs, monomult)
    # List of results
    list_results = [list_wid, list_dst, list_route, list_ord, list_lines, list_pcs, list_monomult]
    return list_results, distance_route

def loop_wave(y_low, y_high, df_orderlines, list_results, n1, n2, distance_threshold, mono_method, multi_method):
    '''Simulate a scenario for each number of orders per wave'''
    # Lists for the records
    list_ordnum, list_dstw = [], []
    lines_number = len(df_orderlines)
    # Test several values of orders per wave
    for orders_number in range(n1, n2):
        # Scenario with orders/wave = orders_number
        list_results, distance_route = simulation_wave(y_low, y_high, orders_number, df_orderlines, list_results,
                                                       distance_threshold, mono_method, multi_method)
        # Append the results per wave
        list_ordnum.append(orders_number)
        list_dstw.append(distance_route)
        print("{} orders/wave: {:,} m".format(orders_number, distance_route))
    # Output lists
    [list_wid, list_dst, list_route, list_ord, list_lines, list_pcs, list_monomult] = [list_results[i] for i in range(len(list_results))]
    # Output results per wave
    df_results, df_reswave = create_dataframe(list_wid, list_dst, list_route, list_ord,
                                              distance_route, list_lines, list_pcs, list_monomult, list_ordnum, list_dstw)
    return list_results, df_reswave

def simulation_cluster(y_low, y_high, df_orderlines, list_results, n1, n2, distance_threshold):
    '''Simulate the three batching methods'''
    # Loop_wave: Simulation 1 (no clustering)
    mono_method, multi_method = 'normal', 'normal'
    list_results, df_reswave1 = loop_wave(y_low, y_high, df_orderlines, list_results, n1, n2,
                                          distance_threshold, mono_method, multi_method)
    # Loop_wave: Simulation 2 (clustering on single line orders)
    mono_method, multi_method = 'clustering', 'normal'
    list_results, df_reswave2 = loop_wave(y_low, y_high, df_orderlines, list_results, n1, n2,
                                          distance_threshold, mono_method, multi_method)
    # Loop_wave: Simulation 3 (clustering on single line orders and centroids of multi line orders)
    mono_method, multi_method = 'clustering', 'clustering'
    list_results, df_reswave3 = loop_wave(y_low, y_high, df_orderlines, list_results, n1, n2,
                                          distance_threshold, mono_method, multi_method)
    # Expand the results
    [list_wid, list_dst, list_route, list_ord, list_lines, list_pcs, list_monomult] = [list_results[i] for i in range(len(list_results))]
    lines_number = len(df_orderlines)
    # Results
    df_results = pd.DataFrame({'wave_number': list_wid,
                               'distance': list_dst,
                               'chemins': list_route,
                               'order_per_wave': list_ord,
                               'lines': list_lines,
                               'pcs': list_pcs,
                               'mono_multi': list_monomult})
    # Final processing
    df_reswave = process_methods(df_reswave1, df_reswave2, df_reswave3, lines_number, distance_threshold)
    return df_reswave, df_results

def create_dataframe(list_wid, list_dst, list_route, list_ord, distance_route, list_lines, list_pcs, list_monomult, list_ordnum, list_dstw):
    '''Create the dataframes of results'''
    # Results by wave
    df_results = pd.DataFrame({'wave_number': list_wid,
                               'distance': list_dst,
                               'chemin': list_route,
                               'orders_per_wave': list_ord,
                               'lines': list_lines,
                               'pcs': list_pcs,
                               'mono_multi': list_monomult})
    # Results by number of orders per wave
    df_reswave = pd.DataFrame({
        'orders_number': list_ordnum,
        'distance': list_dstw
    })
    return df_results, df_reswave

def append_results(list_wid, list_dst, list_route, list_ord, list_lines,
                   list_pcs, list_monomult, wave_id, wave_distance, list_chemin, orders_number, n_lines, n_pcs, monomult):
    '''Append the results of one wave'''
    list_wid.append(wave_id)
    list_dst.append(wave_distance)
    list_route.append(list_chemin)
    list_ord.append(orders_number)
    list_lines.append(n_lines)
    list_pcs.append(n_pcs)
    list_monomult.append(monomult)
    return list_wid, list_dst, list_route, list_ord, list_lines, list_pcs, list_monomult

def process_methods(df_reswave1, df_reswave2, df_reswave3, lines_number, distance_threshold):
    '''Process the results of the three methods'''
    # Rename the distance columns before joining the three dataframes
    df_reswave1.rename(columns={"distance": "distance_method_1"}, inplace=True)
    df_reswave2.rename(columns={"distance": "distance_method_2"}, inplace=True)
    df_reswave3.rename(columns={"distance": "distance_method_3"}, inplace=True)
    df_reswave = df_reswave1.set_index('orders_number')
    # Add the two other methods
    df_reswave['distance_method_2'] = df_reswave2.set_index('orders_number')['distance_method_2']
    df_reswave['distance_method_3'] = df_reswave3.set_index('orders_number')['distance_method_3']
    df_reswave.reset_index().plot.bar(x='orders_number', y=['distance_method_1', 'distance_method_2', 'distance_method_3'],
                                      figsize=(10, 6), color=['black', 'red', 'blue'])
    plt.title("Picking Route Distance for {:,} Order lines / {} m distance threshold".format(lines_number, distance_threshold))
    plt.ylabel('Walking Distance (m)')
    plt.xlabel('Orders per Wave (Orders/Wave)')
    plt.savefig("{}lines_{}m_3mpng".format(lines_number, distance_threshold))
    plt.show()
    return df_reswave

Binary file not shown.

31
utils/process/processing.py Normal file

@@ -0,0 +1,31 @@
import pandas as pd
def process_lines(df_orderlines):
    '''Split the order lines into single line and multi line orders'''
    # Count the lines per order
    df_nline = pd.DataFrame(df_orderlines.groupby(['OrderNumber'])['SKU'].count())
    # Lists
    list_ord = list(df_nline.index.astype(int).values)
    list_lines = list(df_nline['SKU'].values.astype(int))
    # Mapping
    dict_nline = dict(zip(list_ord, list_lines))
    df_orderlines['N_lines'] = df_orderlines['OrderNumber'].map(dict_nline)
    # Split (copies avoid SettingWithCopyWarning on later assignments)
    df_mono = df_orderlines[df_orderlines['N_lines'] == 1].copy()
    df_multi = df_orderlines[df_orderlines['N_lines'] > 1].copy()
    del df_orderlines
    return df_mono, df_multi

def monomult_concat(df_mono, df_multi):
    '''Concatenate single line and multi line orders'''
    # Original coordinate for mono line orders
    df_mono['Coord_Cluster'] = df_mono['Coord']
    # Dataframe concatenation
    df_orderlines = pd.concat([df_mono, df_multi])
    # Counting the number of waves
    waves_number = df_orderlines.WaveID.max() + 1
    return df_orderlines, waves_number
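
if __name__ == '__main__':
    # Toy split: order 1 has two lines (multi), order 2 has one line (mono)
    df = pd.DataFrame({'OrderNumber': [1, 1, 2],
                       'SKU': ['A', 'B', 'A'],
                       'Coord': ['[0, 10]', '[2, 20]', '[4, 30]']})
    df_mono, df_multi = process_lines(df)
    print(len(df_mono), len(df_multi))  # 1 2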

Binary file not shown.

31
utils/results/plot.py Normal file

@@ -0,0 +1,31 @@
import matplotlib.pyplot as plt
import plotly.express as px
import streamlit as st
def plot_simulation1(df_results, lines_number):
    '''Plot the simulation of the batch size'''
    fig = px.bar(data_frame=df_results,
                 width=1200,
                 height=600,
                 x='order_per_wave',
                 y='distance',
                 labels={
                     'order_per_wave': 'Wave size (Orders/Wave)',
                     'distance': 'Total Picking Walking Distance (m)'})
    fig.update_traces(marker_line_width=1, marker_line_color="black")
    st.write(fig)

def plot_simulation2(df_reswave, lines_number, distance_threshold):
    '''Plot the comparison of the three batching methods'''
    fig = px.bar(data_frame=df_reswave.reset_index(),
                 width=1200,
                 height=600,
                 x='orders_number',
                 y=['distance_method_1', 'distance_method_2', 'distance_method_3'],
                 labels={
                     'orders_number': 'Wave size (Orders/Wave)',
                     'distance_method_1': 'NO CLUSTERING APPLIED',
                     'distance_method_2': 'CLUSTERING ON SINGLE LINE ORDERS',
                     'distance_method_3': 'CLUSTERING ON SINGLE LINE AND CENTROID FOR MULTI LINE'},
                 barmode="group")
    fig.update_traces(marker_line_width=1, marker_line_color="black")
    st.write(fig)

Binary file not shown.

Binary file not shown.

84
utils/routing/distances.py Normal file

@@ -0,0 +1,84 @@
import numpy as np
import pandas as pd
from ast import literal_eval

def distance_picking(Loc1, Loc2, y_low, y_high):
    '''Calculate the picker route distance between two locations'''
    # Start point
    x1, y1 = Loc1[0], Loc1[1]
    # End point
    x2, y2 = Loc2[0], Loc2[1]
    # Distance on the x-axis
    distance_x = abs(x2 - x1)
    # Distance on the y-axis
    if x1 == x2:
        distance_y1 = abs(y2 - y1)
        distance_y2 = distance_y1
    else:
        distance_y1 = (y_high - y1) + (y_high - y2)
        distance_y2 = (y1 - y_low) + (y2 - y_low)
    # Minimum distance on the y-axis
    distance_y = min(distance_y1, distance_y2)
    # Total distance
    distance = distance_x + distance_y
    return int(distance)

def next_location(start_loc, list_locs, y_low, y_high):
    '''Find the closest next location'''
    # Distance to every candidate location
    list_dist = [distance_picking(start_loc, i, y_low, y_high) for i in list_locs]
    # Minimum distance
    distance_next = min(list_dist)
    # Location with the minimum distance
    index_min = list_dist.index(min(list_dist))
    next_loc = list_locs[index_min]
    list_locs.remove(next_loc)
    return list_locs, start_loc, next_loc, distance_next

def centroid(list_in):
    '''Centroid of a list of locations'''
    x, y = [p[0] for p in list_in], [p[1] for p in list_in]
    centroid = [round(sum(x) / len(list_in), 2), round(sum(y) / len(list_in), 2)]
    return centroid

def centroid_mapping(df_multi):
    '''Map each multi line order to the centroid of its locations'''
    # Convert the coordinates to lists
    df_multi['Coord'] = df_multi['Coord'].apply(literal_eval)
    # Group coordinates per order
    df_group = pd.DataFrame(df_multi.groupby(['OrderNumber'])['Coord'].apply(list)).reset_index()
    # Calculate the centroid
    df_group['Coord_Centroid'] = df_group['Coord'].apply(centroid)
    # Dictionary for mapping
    list_order, list_coord = list(df_group.OrderNumber.values), list(df_group.Coord_Centroid.values)
    dict_coord = dict(zip(list_order, list_coord))
    # Final mapping
    df_multi['Coord_Cluster'] = df_multi['OrderNumber'].map(dict_coord).astype(str)
    df_multi['Coord'] = df_multi['Coord'].astype(str)
    return df_multi

def distance_picking_cluster(point1, point2):
    '''Picker route distance with fixed alley coordinates, usable as a pdist metric'''
    y_low, y_high = 5.5, 50
    # Start point
    x1, y1 = point1[0], point1[1]
    # End point
    x2, y2 = point2[0], point2[1]
    # Distance on the x-axis
    distance_x = abs(x2 - x1)
    # Distance on the y-axis
    if x1 == x2:
        distance_y1 = abs(y2 - y1)
        distance_y2 = distance_y1
    else:
        distance_y1 = (y_high - y1) + (y_high - y2)
        distance_y2 = (y1 - y_low) + (y2 - y_low)
    # Minimum distance on the y-axis
    distance_y = min(distance_y1, distance_y2)
    # Total distance
    distance = distance_x + distance_y
    return distance
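
if __name__ == '__main__':
    # Worked example with the alley bounds used in app.py (y_low = 5.5, y_high = 50):
    # from [0, 10] to [6, 20], the bottom detour costs (10 - 5.5) + (20 - 5.5) = 19 m,
    # the top detour (50 - 10) + (50 - 20) = 70 m, so distance = 6 + min(19, 70) = 25 m
    print(distance_picking([0, 10], [6, 20], 5.5, 50))  # 25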

56
utils/routing/routes.py Normal file

@@ -0,0 +1,56 @@
from utils.routing.distances import *

def create_picking_route(origin_loc, list_locs, y_low, y_high):
    '''Calculate the total distance to cover a list of locations'''
    # Total distance variable
    wave_distance = 0
    # Current location variable
    start_loc = origin_loc
    # Store the route
    list_chemin = []
    list_chemin.append(start_loc)
    while len(list_locs) > 0:  # Loop until all locations are visited
        # Go to the next location
        list_locs, start_loc, next_loc, distance_next = next_location(start_loc, list_locs, y_low, y_high)
        # Update start_loc
        start_loc = next_loc
        list_chemin.append(start_loc)
        # Update the distance
        wave_distance = wave_distance + distance_next
    # Final distance from the last storage location back to the origin
    wave_distance = wave_distance + distance_picking(start_loc, origin_loc, y_low, y_high)
    list_chemin.append(origin_loc)
    return wave_distance, list_chemin

def create_picking_route_cluster(origin_loc, list_locs, y_low, y_high):
    '''Calculate the total distance to cover a list of locations, tracking the longest single leg'''
    # Total distance variable
    wave_distance = 0
    # Maximum distance of a single leg
    distance_max = 0
    # Current location variable
    start_loc = origin_loc
    # Store the route
    list_chemin = []
    list_chemin.append(start_loc)
    while len(list_locs) > 0:  # Loop until all locations are visited
        # Go to the next location
        list_locs, start_loc, next_loc, distance_next = next_location(start_loc, list_locs, y_low, y_high)
        # Update start_loc
        start_loc = next_loc
        list_chemin.append(start_loc)
        if distance_next > distance_max:
            distance_max = distance_next
        # Update the distance
        wave_distance = wave_distance + distance_next
    # Final distance from the last storage location back to the origin
    wave_distance = wave_distance + distance_picking(start_loc, origin_loc, y_low, y_high)
    list_chemin.append(origin_loc)
    return wave_distance, list_chemin, distance_max
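
if __name__ == '__main__':
    # Minimal end-to-end check of the nearest-neighbour routing (toy coordinates,
    # alley bounds from app.py)
    origin = [0, 5.5]
    locations = [[2, 10], [2, 20], [8, 10]]
    distance, route = create_picking_route(origin, locations, 5.5, 50)
    print(distance)  # total walking distance in metres
    print(route)     # visited coordinates, starting and ending at the origin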