Melt unpivots a table from wide to long format, optionally leaving identifiers set. This operation is useful to massage a table into a format where one or more columns are identifier variables, while all other columns, considered measured variables, are “unpivoted” to the row axis, leaving just two non-identifier columns, ‘variable’ and ‘value’.

  • id_vars: These are the columns in the original dataset that you want to keep as identifier variables. They will remain unchanged in the resulting long format table. Identifier variables are used to uniquely identify each observation in the dataset.
  • value: These are the columns in the original dataset that you want to unpivot or "melt" into a single column. They are considered measured variables and will be combined into two new columns in the resulting long format table: 'variable' and 'value'.

Example

Let's consider a dataset showing the average temperature for a city over a few months. Here's how the original dataset looks like:

YearJanuaryFebruaryMarchApril
202032354250
202131364548

In this dataset, each row represents a year, and the columns represent the average temperature for each month in that year.

Now, let's use the Pandas 'melt' function to transform this dataset into a more organized format:

YearMonthTemperature
2020January32
2020February35
2020March42
2021April50
2021January31
2021February36
2021March45
2021April48

Now the dataset is easier to read and analyze, with each row showing the average temperature for a specific month in a given year.

Example 2

Dataset:

CountryYearPopulationGDP
USA20203310000005.0
China202014393237766.5

To melt this dataset using the melt function, you can set id_vars to ['Country', 'Year'] and value to ['Population', 'GDP']. The resulting long format table would look like this:

CountryYearvariablevalue
USA2020Population331000000
China2020Population1439323776
USA2020GDP5.0
China2020GDP6.5