How to efficiently unpivot MULTIPLE columns in Hive?
My data is structured like in the below table:
| Name | Foo_A | Foo_B | Foo_C | Bar_A | Bar_B | Bar_C |
--------------------------------------------------------
| abcd | 16 | 32 | 14 | 52 | 41 | 17 |
| ... | ... | ... | ... | ... | ... | ... |
I am looking to query the data in Hive in a way such that it looks like this:
| Name | Class | FooVal | BarVal |
----------------------------------
| abcd | A | 16 | 52 |
| abcd | B | 32 | 41 |
| abcd | C | 14 | 17 |
| ... | ... | ... | ... |
I am already aware of and am using a UNION ALL, but what would be a more efficient way of doing this using "LATERAL VIEW explode" a map data type?
hive query-optimization hiveql unpivot
add a comment |
My data is structured like in the below table:
| Name | Foo_A | Foo_B | Foo_C | Bar_A | Bar_B | Bar_C |
--------------------------------------------------------
| abcd | 16 | 32 | 14 | 52 | 41 | 17 |
| ... | ... | ... | ... | ... | ... | ... |
I am looking to query the data in Hive in a way such that it looks like this:
| Name | Class | FooVal | BarVal |
----------------------------------
| abcd | A | 16 | 52 |
| abcd | B | 32 | 41 |
| abcd | C | 14 | 17 |
| ... | ... | ... | ... |
I am already aware of and am using a UNION ALL, but what would be a more efficient way of doing this using "LATERAL VIEW explode" a map data type?
hive query-optimization hiveql unpivot
add a comment |
My data is structured like in the below table:
| Name | Foo_A | Foo_B | Foo_C | Bar_A | Bar_B | Bar_C |
--------------------------------------------------------
| abcd | 16 | 32 | 14 | 52 | 41 | 17 |
| ... | ... | ... | ... | ... | ... | ... |
I am looking to query the data in Hive in a way such that it looks like this:
| Name | Class | FooVal | BarVal |
----------------------------------
| abcd | A | 16 | 52 |
| abcd | B | 32 | 41 |
| abcd | C | 14 | 17 |
| ... | ... | ... | ... |
I am already aware of and am using a UNION ALL, but what would be a more efficient way of doing this using "LATERAL VIEW explode" a map data type?
hive query-optimization hiveql unpivot
My data is structured like in the below table:
| Name | Foo_A | Foo_B | Foo_C | Bar_A | Bar_B | Bar_C |
--------------------------------------------------------
| abcd | 16 | 32 | 14 | 52 | 41 | 17 |
| ... | ... | ... | ... | ... | ... | ... |
I am looking to query the data in Hive in a way such that it looks like this:
| Name | Class | FooVal | BarVal |
----------------------------------
| abcd | A | 16 | 52 |
| abcd | B | 32 | 41 |
| abcd | C | 14 | 17 |
| ... | ... | ... | ... |
I am already aware of and am using a UNION ALL, but what would be a more efficient way of doing this using "LATERAL VIEW explode" a map data type?
hive query-optimization hiveql unpivot
hive query-optimization hiveql unpivot
edited Nov 20 '18 at 17:50
leftjoin
8,92522151
8,92522151
asked Nov 20 '18 at 16:27
lizliz
112
112
add a comment |
add a comment |
2 Answers
2
active
oldest
votes
CROSS JOIN with class
stack
(see code example) will multiply main table rows x3, one row per class
, then use case
statements to derive your columns depending on class
value. CROSS JOIN with small dataset (3 rows) should be transformed to map join and will execute very fast on mappers.
set hive.auto.convert.join=true; --this enables map-join
select t.Name,
s.class,
case s.class when 'A' then t.Foo_A
when 'B' then t.foo_B
when 'C' then t.foo_C
end as FooVal,
case s.class when 'A' then t.Bar_A
when 'B' then t.Bar_B
when 'C' then t.Bar_C
end as BarVal
from table t
cross join (select stack(3,'A','B','C') as class) s
;
It will scan the table only once and perform much better than UNION ALL approach.
add a comment |
Thanks for reply! Please find below another way of doing it which is faster than CROSS JOIN.
select t1.ID, t2.key_1 as class, t2.FooVal, t3.BarVal
from table t1
LATERAL VIEW explode (map(
'A', Foo_A,
'B', Foo_B,
'C', Foo_C
)) t2 as key_1, FooVal
LATERAL VIEW explode (map(
'A', Bar_A,
'B', Bar_B,
'C', Bar_C
)) t3 as key_2, BarVal
where t2.key_1 = t3.key_2;
I'm sorry, but this solution is not faster. Because cross join normally will execute as map-join. And your query does the same twice, multiplying rows x9, after that you filtering rows.
– leftjoin
Feb 5 at 13:34
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53397354%2fhow-to-efficiently-unpivot-multiple-columns-in-hive%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
CROSS JOIN with class
stack
(see code example) will multiply main table rows x3, one row per class
, then use case
statements to derive your columns depending on class
value. CROSS JOIN with small dataset (3 rows) should be transformed to map join and will execute very fast on mappers.
set hive.auto.convert.join=true; --this enables map-join
select t.Name,
s.class,
case s.class when 'A' then t.Foo_A
when 'B' then t.foo_B
when 'C' then t.foo_C
end as FooVal,
case s.class when 'A' then t.Bar_A
when 'B' then t.Bar_B
when 'C' then t.Bar_C
end as BarVal
from table t
cross join (select stack(3,'A','B','C') as class) s
;
It will scan the table only once and perform much better than UNION ALL approach.
add a comment |
CROSS JOIN with class
stack
(see code example) will multiply main table rows x3, one row per class
, then use case
statements to derive your columns depending on class
value. CROSS JOIN with small dataset (3 rows) should be transformed to map join and will execute very fast on mappers.
set hive.auto.convert.join=true; --this enables map-join
select t.Name,
s.class,
case s.class when 'A' then t.Foo_A
when 'B' then t.foo_B
when 'C' then t.foo_C
end as FooVal,
case s.class when 'A' then t.Bar_A
when 'B' then t.Bar_B
when 'C' then t.Bar_C
end as BarVal
from table t
cross join (select stack(3,'A','B','C') as class) s
;
It will scan the table only once and perform much better than UNION ALL approach.
add a comment |
CROSS JOIN with class
stack
(see code example) will multiply main table rows x3, one row per class
, then use case
statements to derive your columns depending on class
value. CROSS JOIN with small dataset (3 rows) should be transformed to map join and will execute very fast on mappers.
set hive.auto.convert.join=true; --this enables map-join
select t.Name,
s.class,
case s.class when 'A' then t.Foo_A
when 'B' then t.foo_B
when 'C' then t.foo_C
end as FooVal,
case s.class when 'A' then t.Bar_A
when 'B' then t.Bar_B
when 'C' then t.Bar_C
end as BarVal
from table t
cross join (select stack(3,'A','B','C') as class) s
;
It will scan the table only once and perform much better than UNION ALL approach.
CROSS JOIN with class
stack
(see code example) will multiply main table rows x3, one row per class
, then use case
statements to derive your columns depending on class
value. CROSS JOIN with small dataset (3 rows) should be transformed to map join and will execute very fast on mappers.
set hive.auto.convert.join=true; --this enables map-join
select t.Name,
s.class,
case s.class when 'A' then t.Foo_A
when 'B' then t.foo_B
when 'C' then t.foo_C
end as FooVal,
case s.class when 'A' then t.Bar_A
when 'B' then t.Bar_B
when 'C' then t.Bar_C
end as BarVal
from table t
cross join (select stack(3,'A','B','C') as class) s
;
It will scan the table only once and perform much better than UNION ALL approach.
edited Nov 20 '18 at 18:14
answered Nov 20 '18 at 17:36
leftjoinleftjoin
8,92522151
8,92522151
add a comment |
add a comment |
Thanks for reply! Please find below another way of doing it which is faster than CROSS JOIN.
select t1.ID, t2.key_1 as class, t2.FooVal, t3.BarVal
from table t1
LATERAL VIEW explode (map(
'A', Foo_A,
'B', Foo_B,
'C', Foo_C
)) t2 as key_1, FooVal
LATERAL VIEW explode (map(
'A', Bar_A,
'B', Bar_B,
'C', Bar_C
)) t3 as key_2, BarVal
where t2.key_1 = t3.key_2;
I'm sorry, but this solution is not faster. Because cross join normally will execute as map-join. And your query does the same twice, multiplying rows x9, after that you filtering rows.
– leftjoin
Feb 5 at 13:34
add a comment |
Thanks for reply! Please find below another way of doing it which is faster than CROSS JOIN.
select t1.ID, t2.key_1 as class, t2.FooVal, t3.BarVal
from table t1
LATERAL VIEW explode (map(
'A', Foo_A,
'B', Foo_B,
'C', Foo_C
)) t2 as key_1, FooVal
LATERAL VIEW explode (map(
'A', Bar_A,
'B', Bar_B,
'C', Bar_C
)) t3 as key_2, BarVal
where t2.key_1 = t3.key_2;
I'm sorry, but this solution is not faster. Because cross join normally will execute as map-join. And your query does the same twice, multiplying rows x9, after that you filtering rows.
– leftjoin
Feb 5 at 13:34
add a comment |
Thanks for reply! Please find below another way of doing it which is faster than CROSS JOIN.
select t1.ID, t2.key_1 as class, t2.FooVal, t3.BarVal
from table t1
LATERAL VIEW explode (map(
'A', Foo_A,
'B', Foo_B,
'C', Foo_C
)) t2 as key_1, FooVal
LATERAL VIEW explode (map(
'A', Bar_A,
'B', Bar_B,
'C', Bar_C
)) t3 as key_2, BarVal
where t2.key_1 = t3.key_2;
Thanks for reply! Please find below another way of doing it which is faster than CROSS JOIN.
select t1.ID, t2.key_1 as class, t2.FooVal, t3.BarVal
from table t1
LATERAL VIEW explode (map(
'A', Foo_A,
'B', Foo_B,
'C', Foo_C
)) t2 as key_1, FooVal
LATERAL VIEW explode (map(
'A', Bar_A,
'B', Bar_B,
'C', Bar_C
)) t3 as key_2, BarVal
where t2.key_1 = t3.key_2;
answered Dec 20 '18 at 16:15
lizliz
112
112
I'm sorry, but this solution is not faster. Because cross join normally will execute as map-join. And your query does the same twice, multiplying rows x9, after that you filtering rows.
– leftjoin
Feb 5 at 13:34
add a comment |
I'm sorry, but this solution is not faster. Because cross join normally will execute as map-join. And your query does the same twice, multiplying rows x9, after that you filtering rows.
– leftjoin
Feb 5 at 13:34
I'm sorry, but this solution is not faster. Because cross join normally will execute as map-join. And your query does the same twice, multiplying rows x9, after that you filtering rows.
– leftjoin
Feb 5 at 13:34
I'm sorry, but this solution is not faster. Because cross join normally will execute as map-join. And your query does the same twice, multiplying rows x9, after that you filtering rows.
– leftjoin
Feb 5 at 13:34
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53397354%2fhow-to-efficiently-unpivot-multiple-columns-in-hive%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown