Create a JSON file from multiple dataframes by selecting rows based on keys
I have n dataframes that all share common key columns. I want to take the rows from these n dataframes whose keys match and create a nested JSON file in which the columns are grouped by the dataframe they came from.
Example (using only 3 dataframes):
dataframe1:
+--------+---------+---------+
| keycol | df1col1 | df1col2 |
+--------+---------+---------+
| x      | a       | 1       |
| y      | b       | 2       |
| z      | c       | 3       |
| p      | d       | 4       |
+--------+---------+---------+
dataframe2:
+--------+---------+---------+
| keycol | df2col1 | df2col2 |
+--------+---------+---------+
| x      | m       | 5       |
| y      | n       | 6       |
| z      | o       | 7       |
| p      | p       | 8       |
+--------+---------+---------+
dataframe3:
+--------+---------+---------+
| keycol | df3col1 | df3col2 |
+--------+---------+---------+
| x      | g       | 9       |
| y      | h       | 10      |
| z      | i       | 11      |
| p      | j       | 12      |
+--------+---------+---------+
I need to create a separate JSON file for each key, holding the data for that key from all of the dataframes. The output structure I am looking for is shown below. It covers only the first key; I want a JSON file like this for every key.
{
  "keycol": "x",
  "dataframe1": {
    "df1col1": "a",
    "df1col2": "1"
  },
  "dataframe2": {
    "df2col1": "m",
    "df2col2": "5"
  },
  "dataframe3": {
    "df3col1": "g",
    "df3col2": "9"
  }
}
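To make the intended shape concrete, here is a minimal plain-Python sketch, not a Spark solution. It assumes the per-dataframe objects sit next to `keycol` in one JSON object, that each key appears at most once per dataframe, and that values are kept as strings as in the sample. The names `frames`, `nest_by_key`, and `write_per_key` are hypothetical, and the rows are small in-memory stand-ins for the real dataframes.

```python
import json
import tempfile
from pathlib import Path


def make_rows(prefix, triples):
    """Build rows shaped like the example tables (hypothetical helper)."""
    return [
        {"keycol": k, f"{prefix}col1": c1, f"{prefix}col2": c2}
        for k, c1, c2 in triples
    ]


# In-memory stand-ins for the three example dataframes.
frames = {
    "dataframe1": make_rows("df1", [("x", "a", "1"), ("y", "b", "2"),
                                    ("z", "c", "3"), ("p", "d", "4")]),
    "dataframe2": make_rows("df2", [("x", "m", "5"), ("y", "n", "6"),
                                    ("z", "o", "7"), ("p", "p", "8")]),
    "dataframe3": make_rows("df3", [("x", "g", "9"), ("y", "h", "10"),
                                    ("z", "i", "11"), ("p", "j", "12")]),
}


def nest_by_key(frames, key="keycol"):
    """Group rows by key, nesting each frame's non-key columns under its name."""
    nested = {}
    for name, rows in frames.items():
        for row in rows:
            k = row[key]
            record = nested.setdefault(k, {key: k})
            record[name] = {c: v for c, v in row.items() if c != key}
    # Keep only keys present in every frame (inner-join behaviour).
    return {k: r for k, r in nested.items() if len(r) == len(frames) + 1}


def write_per_key(nested, out_dir):
    """Write one JSON file per key into out_dir and return the file paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for k, record in nested.items():
        path = out_dir / f"{k}.json"
        path.write_text(json.dumps(record, indent=2))
        paths.append(path)
    return paths


nested = nest_by_key(frames)
print(json.dumps(nested["x"], indent=2))

# Write into a throwaway directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    written = write_per_key(nested, tmp)
    print(sorted(p.name for p in written))
```

This only scales to data that fits in memory on one machine. With real Spark dataframes, the same grouping would presumably be expressed as a join on `keycol` followed by nesting each dataframe's columns, which the sketch does not attempt.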
Thanks in advance for any help.
apache-spark pyspark apache-spark-sql pyspark-sql
Can someone help me out, please? I am new to Spark.
– darla
Nov 14 at 19:57
asked Nov 14 at 5:56
darla