Melt unpivots a table from wide to long format, optionally leaving identifiers set. This operation is useful to massage a table into a format where one or more columns are identifier variables, while all other columns, considered measured variables, are “unpivoted” to the row axis, leaving just two non-identifier columns, ‘variable’ and ‘value’.
- id_vars: These are the columns in the original dataset that you want to keep as identifier variables. They will remain unchanged in the resulting long format table. Identifier variables are used to uniquely identify each observation in the dataset.
- value: These are the columns in the original dataset that you want to unpivot or "melt" into a single column. They are considered measured variables and will be combined into two new columns in the resulting long format table: 'variable' and 'value'.
Let's consider a dataset showing the average temperature for a city over a few months. Here's how the original dataset looks like:
In this dataset, each row represents a year, and the columns represent the average temperature for each month in that year.
Now, let's use the Pandas 'melt' function to transform this dataset into a more organized format:
Now the dataset is easier to read and analyze, with each row showing the average temperature for a specific month in a given year.
To melt this dataset using the melt function, you can set id_vars to ['Country', 'Year'] and value to ['Population', 'GDP']. The resulting long format table would look like this:
Updated 5 months ago