
Impossibility in data science

Across academic fields, various theories have been established showing that something is "impossible," and there are even books that collect them.

Data has two aspects: it is expressed in symbols, and it is obtained through recognition and measurement. There are impossibility theories for both aspects, and they are worth keeping in mind when doing data science.

Impossibility of Symbolic Explanation

Ambiguity in the language itself

Semiotics, philosophy, and linguistics point out that the same word can mean different things to different people, and even to the same person depending on the time and situation. The meaning of a word, it is said, cannot be fixed.

The language needed to give a perfect explanation is itself ambiguous, so a perfect explanation cannot be given as long as language is used.

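Logical impossibility of proof (Gödel's incompleteness theorem)

This is a theorem of logic. The world of logic seems perfect, but the theorem proves that it is not: in any consistent formal system that can express basic arithmetic, there are statements that can be neither proved nor disproved within the system.

Arguing purely in symbols avoids the ambiguity of natural language, but the theorem shows that limits remain even then.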

Impossibility of method of reasoning

Every method of logical reasoning other than deduction involves a leap of logic. In induction, for example, we collect several facts and use them as grounds for an inference such as "so this must hold in general," but there is a leap here.

In everyday life, we frequently use these leaps to advance cognition and learning.

However, the conclusion rests on a limited number of facts, so if someone asks, "Is there really no counterexample?", no one can answer. And if someone objects, "If no one knows, it has not been proven," that is correct. In business, it is practical to treat what you do not know as a risk and move forward with what you do know.

Arrow's impossibility theorem

This is a theorem from social choice theory. Once the conditions that a "perfect" democratic voting method should satisfy are written down, it proves that no voting method can satisfy all of them at once.
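The flavor of the difficulty can be seen in a much simpler, related phenomenon, the Condorcet paradox. The sketch below is only an illustration and not the theorem itself; the three ballots and the function names are hypothetical. It counts pairwise majorities and shows that they can form a cycle, so "the majority's preference" has no consistent ranking.

```python
from itertools import combinations

# Hypothetical ballots: each voter ranks the options from most to least preferred.
ballots = [
    ("A", "B", "C"),
    ("B", "C", "A"),
    ("C", "A", "B"),
]

def majority_prefers(x, y, ballots):
    """Return True if a strict majority of ballots rank x above y."""
    wins = sum(1 for b in ballots if b.index(x) < b.index(y))
    return wins > len(ballots) / 2

def pairwise_winners(ballots):
    """Map each pair of options to its majority winner (None on a tie)."""
    options = sorted(ballots[0])
    result = {}
    for x, y in combinations(options, 2):
        if majority_prefers(x, y, ballots):
            result[(x, y)] = x
        elif majority_prefers(y, x, ballots):
            result[(x, y)] = y
        else:
            result[(x, y)] = None
    return result

winners = pairwise_winners(ballots)
for pair, winner in winners.items():
    print(pair, "->", winner)
```

With these ballots, A beats B, B beats C, and C beats A, so no option can be called the overall majority choice.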

Impossibility of recognition or measurement

Impossibility due to dark data

Some data cannot be observed or identified; it goes by the name of "dark data".

Expectations are sometimes placed on data: "There are hidden treasures in the data, so we must dig them up." But if the information that matters is dark data, it is not in the data at hand, and no amount of analysis will unearth it.

Impossibility due to mismatch of measurement methods

A ruler graduated in 1 mm steps cannot measure something as small as an atom. Nor can it measure something as large as the distance to the moon.

In this way, anything outside the range that the measurement system assumes may be impossible to measure.

The kinds and speeds of things we can perceive with our five senses are likewise limited by the human measurement system. However, much about how humans work is still unknown, so what has been called impossible here may yet be revised.
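The ruler example can be sketched in a few lines of code. Everything here is an assumption made for illustration: the function name, the 300 mm maximum range, and the rough sizes used for an atom (about 1e-7 mm) and the distance to the moon (about 3.844e11 mm).

```python
def measure(true_length_mm, resolution_mm=1.0, max_range_mm=300.0):
    """Simulate a reading from a hypothetical ruler.

    Returns None when the object is longer than the ruler,
    otherwise the length rounded to the nearest graduation.
    """
    if true_length_mm > max_range_mm:
        return None  # outside the range the instrument assumes
    return round(true_length_mm / resolution_mm) * resolution_mm

atom = measure(1e-7)        # far below the 1 mm resolution
moon = measure(3.844e11)    # far beyond the 300 mm range
desk_item = measure(123.4)  # inside the assumed range

print(atom, moon, desk_item)
```

The atom reads as zero and the moon gives no reading at all; only the object inside the assumed range produces a meaningful value.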

Impossibility due to a mismatch between the name of a discipline and its scope

A mismatch of measurement methods is a problem that arises before the data is obtained; this one arises after.

There are fields such as "causal inference", "time series analysis", and "quality engineering".

Details are given on the page for each topic on this site. Each has become an academic discipline with its own defined scope. As a result, there are many causal problems in the world that studying the field called "causal inference" does not help with. The same is true of time series analysis and quality engineering.

In my experience, the "cause and effect", "time series", or "quality" that I am trying to deal with often does not match the existing academic discipline.

Impossibility due to the difficulty of obtaining an exact solution

In complex systems, it has been shown that even a deterministic model may not yield reliable predictions. A familiar example is weather forecasting.

Because the model is deterministic, the future can in principle be computed by simulation, but even a slight change in the initial value changes the result greatly. For example, initial values of 10.001, 10.002, and 10.003 differ only in the fifth digit, yet give different results. Since the initial value can never be pinned down exactly, prediction is not possible. Moreover, in such an unstable system, even if the initial value could be determined precisely, there is no way to know how accurate the intermediate calculations are, so the predictions would still be unreliable.
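This sensitivity to initial values can be reproduced with a very small model. The sketch below uses the logistic map with r = 4, which is an assumption chosen for illustration and not the weather model mentioned above; the initial values 0.10001 and 0.10002 are likewise hypothetical and differ only in the fifth digit.

```python
def logistic_trajectory(x0, r=4.0, steps=50):
    """Iterate the deterministic rule x -> r * x * (1 - x) from x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_trajectory(0.10001)
b = logistic_trajectory(0.10002)

# Absolute gap between the two runs at each step.
gaps = [abs(p - q) for p, q in zip(a, b)]

print("initial gap:", gaps[0])
print("largest gap:", max(gaps))
```

The rule contains no randomness, yet the gap between the two runs grows from about 0.00001 to a size comparable to the values themselves within a few dozen steps.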

Heisenberg uncertainty principle

Quantum physics is the field that studies the microscopic states of matter. In everyday life, position and velocity can be treated as definite values, but those of an electron cannot.

Within physics, what matters is how to describe this property of electrons and how to apply it.

The uncertainty principle has become famous well outside physics, apparently because it is so at odds with everyday intuition. Various stories have emerged to fill that gap.
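As a point of reference, the position-momentum form of the principle found in standard textbooks can be written as follows. The symbols do not appear in the text above: \sigma_x and \sigma_p denote the standard deviations of position and momentum, and \hbar is the reduced Planck constant.

```latex
\sigma_x \, \sigma_p \ge \frac{\hbar}{2}
```

The inequality bounds the product of the two spreads from below, so making one smaller forces the other to be larger.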

Impossibility under the influence of the act of knowing

What is obtained by measurement, in other words data, is assumed to "represent what was measured". For example, when a length is measured with a ruler, the resulting "XX mm" is taken to be the length of that object.

However, this may not be the case: the data may be influenced by the act of measurement itself. For example, in a survey, the order of the questions and their wording change the answers. In physical measurement too, the light used to observe can alter what is being measured, so what happens while you are looking differs from what happens while you are not.

Writings on the uncertainty principle sometimes explain that "the uncertainty is caused by the act of measuring". However, the uncertainty principle is distinct from this measurement effect: the indeterminacy is a property of the electron itself.

Uncertainty of statistical decisions

Statistics deals with the property that when many things are viewed as a whole, they can be treated as following a distribution, such as the normal distribution. Once a distribution is assumed, we can reason probabilistically about which value will occur. The point is that even if each individual value is determined deterministically, the phenomenon as a whole is expressed using probability.

For example, we can then say that "each face of a die comes up with probability 1/6".

The variation described by the distribution is the source of uncertainty when things are treated statistically.

Misunderstandings of the uncertainty principle seem to come from confusing two ideas: treating deterministic things probabilistically, as statistics does, and the idea that "a single electron itself has a spread".
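The die example can be checked with a short simulation. This is a minimal sketch under stated assumptions: the function name, the fixed seed, and the number of rolls are all hypothetical choices, and a pseudo-random generator stands in for a physical die.

```python
import random
from collections import Counter

def roll_frequencies(n_rolls, seed=0):
    """Roll a simulated fair die n_rolls times and return the share of each face."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    return {face: counts[face] / n_rolls for face in range(1, 7)}

freqs = roll_frequencies(60_000)
for face, share in freqs.items():
    print(face, round(share, 4))
```

No single roll can be predicted from this, but over many rolls every face settles close to 1/6, which is the sense in which the whole is described by probability.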

Using the Theory of Impossibility

An impossibility theory can stretch the imagination: applied by analogy to a completely different field, it can become a new approach there.

However, an impossibility theory rests on specific conditions and definitions of terms, so it is an abuse to ignore those conditions, pull out only the symbolic statement, and talk about it as if it were a universal truth. Also, even when the conditions are met, it is not known whether the same thing is "impossible" in other fields.

Therefore, it may be fine to present it as a hypothesis in another field, but writing that "impossibility has been proven" goes too far. Such overstatements are sometimes seen in explanations of science.




