What is skewed data?
Skewness describes how values are distributed in a dataset. Data is highly skewed when a few values of a column account for most of the rows while the remaining values account for very few, i.e., the rows are not evenly distributed across the column's values.
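One rough way to see skew is to count the rows per column value and compare the largest group with the average group size. The sketch below is only an illustration: the `country` column, the sample rows, and the helper names `key_distribution` and `skew_ratio` are assumptions made for this example, not part of any particular system.

```python
from collections import Counter

def key_distribution(rows, key):
    """Count how many rows share each value of the given column."""
    return Counter(row[key] for row in rows)

def skew_ratio(counts):
    """Largest group size divided by the mean group size (1.0 means perfectly even)."""
    mean = sum(counts.values()) / len(counts)
    return max(counts.values()) / mean

# Hypothetical sample: 8 rows, 6 of which share one country.
rows = [{"country": "US"} for _ in range(6)] + [{"country": "DE"}, {"country": "JP"}]

counts = key_distribution(rows, "country")
print(counts.most_common())   # groups ordered from most to least frequent
print(skew_ratio(counts))     # well above 1.0 for this sample
```

A ratio close to 1.0 suggests an even distribution; the further it rises above 1.0, the more a single value dominates the column.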
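Data skew degrades performance and parallelism in distributed systems: when work is partitioned by a skewed column, the partitions that receive the frequent values take much longer to process than the rest, so most workers sit idle while a few are still running.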
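The effect on parallelism can be sketched by simulating hash partitioning. This is a simplified model, not how any specific engine works: the choice of CRC32 as the hash, the four partitions, and the helper names `partition_for` and `partition_loads` are all assumptions for illustration.

```python
import zlib

def partition_for(key, num_partitions):
    """Map a key to a partition using a deterministic hash (CRC32, chosen for this sketch)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

def partition_loads(keys, num_partitions):
    """Return how many rows each partition receives."""
    loads = [0] * num_partitions
    for key in keys:
        loads[partition_for(key, num_partitions)] += 1
    return loads

# Hypothetical skewed input: one hot key holds 6 of the 8 rows.
keys = ["US"] * 6 + ["DE", "JP"]
loads = partition_loads(keys, num_partitions=4)

print(loads)
# With perfect balance each of the 4 partitions would hold 2 rows, but all rows
# with the same key land in the same partition, so one partition holds at least 6.
print(max(loads), sum(loads) / len(loads))
```

Because every row with the same key goes to the same partition, adding more partitions does not shrink the largest one; the job can finish no sooner than the worker handling the hot key.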