Sometimes the order of the data has a meaning, but that meaning is not recorded in the data itself because it is meta knowledge.
If we add that meaning to the data as a new column, the analysis can produce richer output.
In Excel this is easy to picture, because we can write the meaning directly next to the data using functions. I had trouble, however, when I wanted to do the same thing in Python.
This page is my memo on how to do it in Python.
The code on this page assumes that the input data already exists as a DataFrame named "df".
If the starting data is a CSV file, the code to make "df" is below.
import pandas as pd  # Load the pandas package
df = pd.read_csv("Data.csv")  # Read the data
This example uses "X1 = 4" as the limit: rows where X1 is below 4 are labeled "OK", and all other rows are labeled "NG".
In Excel, I use the IF function for this process.
df['C2']=df.X1.apply(lambda x:'OK' if x <4 else 'NG')
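To check the threshold rule, here is a minimal sketch on a small made-up table. The values are hypothetical; only the column names X1 and C2 come from the code above. It also shows `numpy.where` as a vectorized alternative to `apply`, which should give the same labels.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column name X1 comes from the text above.
df = pd.DataFrame({"X1": [1, 3, 4, 6]})

# Same rule as above: values below 4 are "OK", everything else is "NG".
df["C2"] = df["X1"].apply(lambda x: "OK" if x < 4 else "NG")

# Vectorized alternative (the column name C2_np is only for comparison).
df["C2_np"] = np.where(df["X1"] < 4, "OK", "NG")

print(df)
```

Note that the value 4 itself is labeled "NG", because the condition is "less than 4".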
This step adds an order number to the rows within the same category.
After this process, we can do Passing Analysis.
In Excel, I do this with a complicated combination of IF functions.
In the manufacturing industry, this category is typically a "Lot" or a "Batch".
df['X2']=df.groupby('C1').cumcount()+1
Without the final "+1", the order numbers start from "0".
If the category is defined by a combination of columns, the code is below.
df['X2']=df.groupby(['C1','C2']).cumcount()+1
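As a small illustration of what `cumcount` produces, the sketch below numbers the rows of a made-up table. The values are hypothetical; C1 plays the role of the lot or batch name.

```python
import pandas as pd

# Hypothetical sample data; C1 is the category (for example, the lot name).
df = pd.DataFrame({"C1": ["A", "A", "B", "A", "B"]})

# Number the rows inside each category in their current row order, starting at 1.
df["X2"] = df.groupby("C1").cumcount() + 1

print(df["X2"].tolist())  # [1, 2, 1, 3, 2]
```

The numbering follows the current row order, so if the order should follow a time column, it is probably safer to sort the data by that column first.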
Sometimes the order numbers repeat, and each repetition is one cycle.
In the manufacturing industry, such a cycle is typically a "Lot" or a "Batch".
To analyze the differences between lots or batches, we need a group variable that identifies each one.
df['X3'] = (df['X2'] == 1).cumsum()
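The idea is that every row where the order is 1 marks the start of a new cycle, so a running count of those rows gives the cycle number. A minimal sketch with hypothetical values:

```python
import pandas as pd

# Hypothetical order column: the count restarts at 1 whenever a new lot begins.
df = pd.DataFrame({"X2": [1, 2, 3, 1, 2, 1, 2, 3]})

# True at each restart; the running sum of the True values numbers the cycles.
df["X3"] = (df["X2"] == 1).cumsum()

print(df["X3"].tolist())  # [1, 1, 1, 2, 2, 3, 3, 3]

# The group variable can then be used to summarize each cycle, e.g. its length.
cycle_length = df.groupby("X3").size()
print(cycle_length.tolist())  # [3, 2, 3]
```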
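Sometimes a column of 0s and 1s marks the cycle.
In machine data, for example, "0 = stopping" and "1 = moving".
We can make a group variable using the difference between consecutive rows ( Velocity Data ): the difference is 1 exactly where the machine changes from stopping to moving.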
df['X3']=df.X2.diff() # Make difference data
df['X4'] = (df['X3'] == 1).cumsum()
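The sketch below runs these two lines on a made-up 0-1 signal and then numbers the rows inside each cycle with `cumcount`. The values and the column name X5 are hypothetical.

```python
import pandas as pd

# Hypothetical machine state: 0 = stopping, 1 = moving.
df = pd.DataFrame({"X2": [0, 1, 1, 0, 0, 1, 1, 1, 0]})

# Difference from the previous row: +1 at each start, -1 at each stop.
# The first row has no previous row, so its difference is NaN.
df["X3"] = df["X2"].diff()

# Count the starts so far; this is the cycle number.
df["X4"] = (df["X3"] == 1).cumsum()
print(df["X4"].tolist())  # [0, 1, 1, 1, 1, 2, 2, 2, 2]

# Order of each row inside its cycle (X5 is a name chosen for this sketch).
df["X5"] = df.groupby("X4").cumcount() + 1
print(df["X5"].tolist())  # [1, 1, 2, 3, 4, 1, 2, 3, 4]
```

With this rule, rows before the first start get group 0, and the stopping rows after a run stay in the same group as that run. If the data begins with the machine already moving, that first run is also counted as group 0, because the first difference is NaN.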
After this process, we can use the method "Add Order by Category".
We can make 0-1 data when the data alternates between high and low values in each cycle.
For example, in factory data, electric current, flow rate and temperature are often "low = stopping" and "high = moving".
The method is the same as "Add Condition". In this example, the limit is "4".
df['X2']=df.X1.apply(lambda x:0 if x <4 else 1)
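To show how this connects to the cycle grouping, here is a minimal sketch that converts a made-up measurement into 0-1 data and then counts the cycles from the 0-to-1 changes. The values are hypothetical, and the limit of 4 is the one used in this example.

```python
import pandas as pd

# Hypothetical measurement, e.g. electric current: low = stopping, high = moving.
df = pd.DataFrame({"X1": [1.0, 2.0, 5.0, 6.0, 2.0, 1.0, 7.0, 8.0]})

# Below the limit of 4 becomes 0 (stopping); everything else becomes 1 (moving).
df["X2"] = df["X1"].apply(lambda x: 0 if x < 4 else 1)
print(df["X2"].tolist())  # [0, 0, 1, 1, 0, 0, 1, 1]

# A change from 0 to 1 marks the start of a cycle; count the starts so far.
df["X4"] = (df["X2"].diff() == 1).cumsum()
print(df["X4"].tolist())  # [0, 0, 1, 1, 1, 1, 2, 2]
```

If the measurement is noisy near the limit, a single threshold may produce many short false cycles, so it is worth checking the 0-1 result on a plot before grouping.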