
Make Data of Meta Knowledge

Sometimes the order of the data has a meaning, but that meaning is not recorded in the data itself because it is meta knowledge.

If we add that meaning to the data, the output of the analysis can be richer.

In Excel this is not difficult, because we can write the meaning into the data directly with functions. I had trouble, however, when I wanted to do the same thing in Python.

This page is my memo on how to do it in Python.

Prepare to use Python

The code on this page assumes that the input data is already loaded in a DataFrame named "df".

If the source data is a CSV file, the code to make "df" is below.

import pandas as pd  # Import the pandas package
df = pd.read_csv("Data.csv")  # Read the CSV file into df

Add Condition

This example uses the limit "4" on X1 to decide "OK" or "NG": values of X1 below 4 are "OK", and all other values are "NG".

In Excel, I use the IF function for this process.

df['C2'] = df.X1.apply(lambda x: 'OK' if x < 4 else 'NG')
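
As a minimal sketch of the condition above, the following runs the same rule on a small made-up DataFrame (the values are illustrative and not from the original data). It also shows `numpy.where` as a vectorized alternative to `apply`. That alternative is an addition for comparison, not part of the original memo.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the values are made up for illustration.
df = pd.DataFrame({"X1": [1, 3, 4, 6]})

# Same rule as above: "OK" when X1 is below the limit 4, otherwise "NG".
df["C2"] = df.X1.apply(lambda x: "OK" if x < 4 else "NG")

# Vectorized alternative that gives the same labels.
df["C2_np"] = np.where(df["X1"] < 4, "OK", "NG")

print(df)
```

Note that the value 4 itself is labeled "NG", because the comparison is "less than", not "less than or equal to".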

Add Order by Category

Add a sequence number to the rows within each category. After this process, we can do Passing Analysis .

In Excel, I use the IF function in a complicated way.

In the manufacturing industry, this category is typically "Lot" or "Batch".

df['X2'] = df.groupby('C1').cumcount() + 1
Without the last "+1", the order starts from "0".

If we use a combination of categories, the code is below.
df['X2'] = df.groupby(['C1', 'C2']).cumcount() + 1
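
The following is a minimal sketch of the single-category case, assuming a made-up column C1 whose categories are interleaved. It shows that the order is counted separately for each category, following the row order of the data.

```python
import pandas as pd

# Hypothetical sample data: two categories, "A" and "B", mixed together.
df = pd.DataFrame({"C1": ["A", "A", "B", "A", "B"]})

# Number the rows within each category, starting from 1.
df["X2"] = df.groupby("C1").cumcount() + 1

print(df)
```

Here the rows of "A" get 1, 2, 3 and the rows of "B" get 1, 2, even though the two categories are not sorted.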

Make Group Variable Using the Meta Knowledge of the Order

Sometimes the repetition of the order indicates a cycle: each time the order goes back to 1, a new cycle starts. In the manufacturing industry, this cycle is typically a "Lot" or "Batch".

If we want to analyze the differences between lots or batches, we need a group variable.

df['X3'] = (df['X2'] == 1).cumsum()
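
A minimal sketch of this step, assuming a made-up order column X2 that restarts from 1 three times. The comparison marks the first row of each cycle, and the cumulative sum turns those marks into a group number.

```python
import pandas as pd

# Hypothetical sample data: the order restarts at 1 for each new lot.
df = pd.DataFrame({"X2": [1, 2, 3, 1, 2, 1, 2, 3]})

# True at the first row of each cycle; cumsum counts the cycles so far.
df["X3"] = (df["X2"] == 1).cumsum()

# Example use of the group variable: number of rows in each lot.
lot_sizes = df.groupby("X3").size()

print(df)
print(lot_sizes)
```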

Make Group Variable Using the Meta Knowledge of 0-1 Data

Sometimes 0 and 1 indicate a cycle. In machine data, for example, "0 = stopping" and "1 = moving".

We can make a group variable using the difference ( Velocity Data ): the difference is 1 at each change from 0 to 1.

df['X3'] = df.X2.diff()  # Make difference data
df['X4'] = (df['X3'] == 1).cumsum()
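
A minimal sketch of this step on made-up 0-1 data. The first value of the difference is missing (NaN) because the first row has no previous row, so rows before the first change from 0 to 1 get group number 0.

```python
import pandas as pd

# Hypothetical machine data: 0 = stopping, 1 = moving.
df = pd.DataFrame({"X2": [0, 1, 1, 0, 0, 1, 1, 1, 0]})

# Difference from the previous row: 1 at a start, -1 at a stop, 0 otherwise.
df["X3"] = df.X2.diff()

# Count the starts so far; each start opens a new group.
df["X4"] = (df["X3"] == 1).cumsum()

print(df)
```

With this rule, the stopping rows after a run keep the group number of that run. If the data begin with 1, the first run is also numbered 0, because no start is detected on the first row.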

After this process, we can use the method "Add Order by Category".

Make 0-1 Data Using the Meta Knowledge

We can make 0-1 data if the data alternate between high and low values with the cycle.

For example, in factory data, electric current, flow rate and temperature are "low = stopping" and "high = moving".

The method is the same as "Add Condition". In this example, the limit is "4".

df['X2'] = df.X1.apply(lambda x: 0 if x < 4 else 1)
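
As a rough sketch of how the steps on this page can be chained, the following starts from made-up sensor values in X1, makes the 0-1 data, makes the group variable from the difference, and then adds the order within each group. The values and the extra column name X5 are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sensor data (for example electric current); values are made up.
df = pd.DataFrame({"X1": [1, 2, 5, 6, 2, 1, 7, 8, 9]})

# 0-1 data: 0 = stopping (below the limit 4), 1 = moving.
df["X2"] = df.X1.apply(lambda x: 0 if x < 4 else 1)

# Group variable: a new group starts at each change from 0 to 1.
df["X3"] = df.X2.diff()
df["X4"] = (df["X3"] == 1).cumsum()

# Order within each group, starting from 1.
df["X5"] = df.groupby("X4").cumcount() + 1

print(df)
```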



