Difference Between a Staging and the Production DWH
I have a general question on a Data Warehouse Architecture. I have heard a lot of a Staging and a Production, so that data should be loaded via SSIS first into a staging dwh and then via SSIS to the Production environment.
But is there any difference? I mean is it only a "copy" of the production data warehouse (to do some quality checks on data before pushing to production?) or of the source data in a different model or (different schema)?
From software development perspective i know that both are identical and the only purpose is to check the software version from all stakeholders and get the release to deploy on prod. Normally in the software case the staging is same (maybe only hardware sizing is different) to production.
sql-server data-warehouse
add a comment |
I have a general question on a Data Warehouse Architecture. I have heard a lot of a Staging and a Production, so that data should be loaded via SSIS first into a staging dwh and then via SSIS to the Production environment.
But is there any difference? I mean is it only a "copy" of the production data warehouse (to do some quality checks on data before pushing to production?) or of the source data in a different model or (different schema)?
From software development perspective i know that both are identical and the only purpose is to check the software version from all stakeholders and get the release to deploy on prod. Normally in the software case the staging is same (maybe only hardware sizing is different) to production.
sql-server data-warehouse
Are you a software dev/admin/ops person now being asked to deal with a data warehouse? What's the context for this question please?
– seventyeightist
Feb 17 at 20:39
Exactly. I am normally a classical software developer and nowadays more involved in bi topics.
– STORM
Feb 18 at 6:44
add a comment |
I have a general question on a Data Warehouse Architecture. I have heard a lot of a Staging and a Production, so that data should be loaded via SSIS first into a staging dwh and then via SSIS to the Production environment.
But is there any difference? I mean is it only a "copy" of the production data warehouse (to do some quality checks on data before pushing to production?) or of the source data in a different model or (different schema)?
From software development perspective i know that both are identical and the only purpose is to check the software version from all stakeholders and get the release to deploy on prod. Normally in the software case the staging is same (maybe only hardware sizing is different) to production.
sql-server data-warehouse
I have a general question on a Data Warehouse Architecture. I have heard a lot of a Staging and a Production, so that data should be loaded via SSIS first into a staging dwh and then via SSIS to the Production environment.
But is there any difference? I mean is it only a "copy" of the production data warehouse (to do some quality checks on data before pushing to production?) or of the source data in a different model or (different schema)?
From software development perspective i know that both are identical and the only purpose is to check the software version from all stakeholders and get the release to deploy on prod. Normally in the software case the staging is same (maybe only hardware sizing is different) to production.
sql-server data-warehouse
sql-server data-warehouse
asked Feb 17 at 18:13
STORMSTORM
1184
1184
Are you a software dev/admin/ops person now being asked to deal with a data warehouse? What's the context for this question please?
– seventyeightist
Feb 17 at 20:39
Exactly. I am normally a classical software developer and nowadays more involved in bi topics.
– STORM
Feb 18 at 6:44
add a comment |
Are you a software dev/admin/ops person now being asked to deal with a data warehouse? What's the context for this question please?
– seventyeightist
Feb 17 at 20:39
Exactly. I am normally a classical software developer and nowadays more involved in bi topics.
– STORM
Feb 18 at 6:44
Are you a software dev/admin/ops person now being asked to deal with a data warehouse? What's the context for this question please?
– seventyeightist
Feb 17 at 20:39
Are you a software dev/admin/ops person now being asked to deal with a data warehouse? What's the context for this question please?
– seventyeightist
Feb 17 at 20:39
Exactly. I am normally a classical software developer and nowadays more involved in bi topics.
– STORM
Feb 18 at 6:44
Exactly. I am normally a classical software developer and nowadays more involved in bi topics.
– STORM
Feb 18 at 6:44
add a comment |
3 Answers
3
active
oldest
votes
In general 'stage' or 'staging' is a sort of intermediate place or "Holding area" for data before being put into its final format in the presentation layer. (Within your environments (e.g. servers) it could be on the same server, or a different one, than the final format -- depending on your individual setup. Typically it would be at least in a different database, on the same server.)
The important point is that 'staging' is generally a place where data -- which could be from various different sources, depending on what you are ingesting into your DWH -- gets landed initially, and then it gets processed into the final form that is consumed by your business users.
In a minimal case it could be just a "copy" of what's in production, before you push it, but typically there would be additional processing carried out on it before putting into production such as:
- translating raw data into dimensions
- grouping and aggregating
- cleaning (e.g. how to deal with missing values)
Staging has a different meaning in data warehousing than it does in general software development (I know what you are getting at -- the existence of a 'stage' environment between UAT and production which is the same as production but a simulated copy).
In my (DWH) experience Staging is generally in a pretty similar format to the raw data, e.g. if you have 'Account' on your source server then you could have 'StageAccount' in the staging db, and then you would carry out some more processing on the StageAccount to transform it into (e.g.) DimAccount records.
add a comment |
I think this is a case of the same word being used to describe two different things.
The first being a staging environment. As you mention, this is a near copy of the production environment architecture and can be used to test releases that are due to be moved into production or to allow users to view upcoming features before they are released.
The second is a data staging area. Although it is called staging, it could actually be located within the production environment (as has been the case in most companies that I have worked for). This area is used to cleanse and validate data extracted from OLTP systems before being loaded into a data warehouse.
Generally, at least in my work history, I have found that production data may not be copied into the staging environment due to legal or logistical constraints. Mostly I have found that it is because developers/testers should not access sensitive data such as personal details, credit cards etc... but it can be copied into a data staging area in preparation of loading it into the data warehouse.
Over time, as you gain more experience it gets easier to tell which one an individual is referring to - but until then, just remember that live data should not leave the production environment and it is perfectly fine to have a dedicated area within production in which you can process your data.
add a comment |
As a SF Developer, we have processes when it comes to this type of thing you're describing, firstly we do not use staging for direct changes to the system rather we create new sandboxes with referenced Jira ticket numbers (e.g. SALLY-369).
I'd like to also mention that we have data and metadata which I hope you're familiar with; when we get a ticket that is just a data change we do it straight into staging, and then once we get the all clear we create whatever it be into the live system
Secondly, I'd like to explain the whole pushing process at least for our company when a ticket is created we create a new sandbox that is a direct copy of staging but in no way affects staging, we then make the changes in the newly created sandbox when the changes are made in the sandbox and get the go-ahead to push into staging for testing we only use staging for testing/stakeholders
Now the reason we do this is due to things being overwritten if changes are made DIRECT into staging which is inefficient and overall bad practice.
Now production, I assume you're talking about the live system (as production can be staging or live as they're under the same category) Live is an out of date version of staging, we want this though, as Live is customer facing we have releases which happen every few weeks ( this is where the staging date is analysed and deemed good to go) and thus is then pushed into live.
This is how the whole process goes, don't worry about int, its just a derivative of staging that is the first step into building into staging:
- 1. Sandbox
- 2. int
- 3. staging(UAT/production)
- 4. Live (production)
I hope this clears things up for you if not, reply to this and I'll be happy to explain in more detail for you.
New contributor
add a comment |
Your Answer
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "182"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f229988%2fdifference-between-a-staging-and-the-production-dwh%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
In general 'stage' or 'staging' is a sort of intermediate place or "Holding area" for data before being put into its final format in the presentation layer. (Within your environments (e.g. servers) it could be on the same server, or a different one, than the final format -- depending on your individual setup. Typically it would be at least in a different database, on the same server.)
The important point is that 'staging' is generally a place where data -- which could be from various different sources, depending on what you are ingesting into your DWH -- gets landed initially, and then it gets processed into the final form that is consumed by your business users.
In a minimal case it could be just a "copy" of what's in production, before you push it, but typically there would be additional processing carried out on it before putting into production such as:
- translating raw data into dimensions
- grouping and aggregating
- cleaning (e.g. how to deal with missing values)
Staging has a different meaning in data warehousing than it does in general software development (I know what you are getting at -- the existence of a 'stage' environment between UAT and production which is the same as production but a simulated copy).
In my (DWH) experience Staging is generally in a pretty similar format to the raw data, e.g. if you have 'Account' on your source server then you could have 'StageAccount' in the staging db, and then you would carry out some more processing on the StageAccount to transform it into (e.g.) DimAccount records.
add a comment |
In general 'stage' or 'staging' is a sort of intermediate place or "Holding area" for data before being put into its final format in the presentation layer. (Within your environments (e.g. servers) it could be on the same server, or a different one, than the final format -- depending on your individual setup. Typically it would be at least in a different database, on the same server.)
The important point is that 'staging' is generally a place where data -- which could be from various different sources, depending on what you are ingesting into your DWH -- gets landed initially, and then it gets processed into the final form that is consumed by your business users.
In a minimal case it could be just a "copy" of what's in production, before you push it, but typically there would be additional processing carried out on it before putting into production such as:
- translating raw data into dimensions
- grouping and aggregating
- cleaning (e.g. how to deal with missing values)
Staging has a different meaning in data warehousing than it does in general software development (I know what you are getting at -- the existence of a 'stage' environment between UAT and production which is the same as production but a simulated copy).
In my (DWH) experience Staging is generally in a pretty similar format to the raw data, e.g. if you have 'Account' on your source server then you could have 'StageAccount' in the staging db, and then you would carry out some more processing on the StageAccount to transform it into (e.g.) DimAccount records.
add a comment |
In general 'stage' or 'staging' is a sort of intermediate place or "Holding area" for data before being put into its final format in the presentation layer. (Within your environments (e.g. servers) it could be on the same server, or a different one, than the final format -- depending on your individual setup. Typically it would be at least in a different database, on the same server.)
The important point is that 'staging' is generally a place where data -- which could be from various different sources, depending on what you are ingesting into your DWH -- gets landed initially, and then it gets processed into the final form that is consumed by your business users.
In a minimal case it could be just a "copy" of what's in production, before you push it, but typically there would be additional processing carried out on it before putting into production such as:
- translating raw data into dimensions
- grouping and aggregating
- cleaning (e.g. how to deal with missing values)
Staging has a different meaning in data warehousing than it does in general software development (I know what you are getting at -- the existence of a 'stage' environment between UAT and production which is the same as production but a simulated copy).
In my (DWH) experience Staging is generally in a pretty similar format to the raw data, e.g. if you have 'Account' on your source server then you could have 'StageAccount' in the staging db, and then you would carry out some more processing on the StageAccount to transform it into (e.g.) DimAccount records.
In general 'stage' or 'staging' is a sort of intermediate place or "Holding area" for data before being put into its final format in the presentation layer. (Within your environments (e.g. servers) it could be on the same server, or a different one, than the final format -- depending on your individual setup. Typically it would be at least in a different database, on the same server.)
The important point is that 'staging' is generally a place where data -- which could be from various different sources, depending on what you are ingesting into your DWH -- gets landed initially, and then it gets processed into the final form that is consumed by your business users.
In a minimal case it could be just a "copy" of what's in production, before you push it, but typically there would be additional processing carried out on it before putting into production such as:
- translating raw data into dimensions
- grouping and aggregating
- cleaning (e.g. how to deal with missing values)
Staging has a different meaning in data warehousing than it does in general software development (I know what you are getting at -- the existence of a 'stage' environment between UAT and production which is the same as production but a simulated copy).
In my (DWH) experience Staging is generally in a pretty similar format to the raw data, e.g. if you have 'Account' on your source server then you could have 'StageAccount' in the staging db, and then you would carry out some more processing on the StageAccount to transform it into (e.g.) DimAccount records.
answered Feb 17 at 20:36
seventyeightistseventyeightist
93948
93948
add a comment |
add a comment |
I think this is a case of the same word being used to describe two different things.
The first being a staging environment. As you mention, this is a near copy of the production environment architecture and can be used to test releases that are due to be moved into production or to allow users to view upcoming features before they are released.
The second is a data staging area. Although it is called staging, it could actually be located within the production environment (as has been the case in most companies that I have worked for). This area is used to cleanse and validate data extracted from OLTP systems before being loaded into a data warehouse.
Generally, at least in my work history, I have found that production data may not be copied into the staging environment due to legal or logistical constraints. Mostly I have found that it is because developers/testers should not access sensitive data such as personal details, credit cards etc... but it can be copied into a data staging area in preparation of loading it into the data warehouse.
Over time, as you gain more experience it gets easier to tell which one an individual is referring to - but until then, just remember that live data should not leave the production environment and it is perfectly fine to have a dedicated area within production in which you can process your data.
add a comment |
I think this is a case of the same word being used to describe two different things.
The first being a staging environment. As you mention, this is a near copy of the production environment architecture and can be used to test releases that are due to be moved into production or to allow users to view upcoming features before they are released.
The second is a data staging area. Although it is called staging, it could actually be located within the production environment (as has been the case in most companies that I have worked for). This area is used to cleanse and validate data extracted from OLTP systems before being loaded into a data warehouse.
Generally, at least in my work history, I have found that production data may not be copied into the staging environment due to legal or logistical constraints. Mostly I have found that it is because developers/testers should not access sensitive data such as personal details, credit cards etc... but it can be copied into a data staging area in preparation of loading it into the data warehouse.
Over time, as you gain more experience it gets easier to tell which one an individual is referring to - but until then, just remember that live data should not leave the production environment and it is perfectly fine to have a dedicated area within production in which you can process your data.
add a comment |
I think this is a case of the same word being used to describe two different things.
The first being a staging environment. As you mention, this is a near copy of the production environment architecture and can be used to test releases that are due to be moved into production or to allow users to view upcoming features before they are released.
The second is a data staging area. Although it is called staging, it could actually be located within the production environment (as has been the case in most companies that I have worked for). This area is used to cleanse and validate data extracted from OLTP systems before being loaded into a data warehouse.
Generally, at least in my work history, I have found that production data may not be copied into the staging environment due to legal or logistical constraints. Mostly I have found that it is because developers/testers should not access sensitive data such as personal details, credit cards etc... but it can be copied into a data staging area in preparation of loading it into the data warehouse.
Over time, as you gain more experience it gets easier to tell which one an individual is referring to - but until then, just remember that live data should not leave the production environment and it is perfectly fine to have a dedicated area within production in which you can process your data.
I think this is a case of the same word being used to describe two different things.
The first being a staging environment. As you mention, this is a near copy of the production environment architecture and can be used to test releases that are due to be moved into production or to allow users to view upcoming features before they are released.
The second is a data staging area. Although it is called staging, it could actually be located within the production environment (as has been the case in most companies that I have worked for). This area is used to cleanse and validate data extracted from OLTP systems before being loaded into a data warehouse.
Generally, at least in my work history, I have found that production data may not be copied into the staging environment due to legal or logistical constraints. Mostly I have found that it is because developers/testers should not access sensitive data such as personal details, credit cards etc... but it can be copied into a data staging area in preparation of loading it into the data warehouse.
Over time, as you gain more experience it gets easier to tell which one an individual is referring to - but until then, just remember that live data should not leave the production environment and it is perfectly fine to have a dedicated area within production in which you can process your data.
answered Feb 17 at 18:33
Mr.BrownstoneMr.Brownstone
9,19432342
9,19432342
add a comment |
add a comment |
As a SF Developer, we have processes when it comes to this type of thing you're describing, firstly we do not use staging for direct changes to the system rather we create new sandboxes with referenced Jira ticket numbers (e.g. SALLY-369).
I'd like to also mention that we have data and metadata which I hope you're familiar with; when we get a ticket that is just a data change we do it straight into staging, and then once we get the all clear we create whatever it be into the live system
Secondly, I'd like to explain the whole pushing process at least for our company when a ticket is created we create a new sandbox that is a direct copy of staging but in no way affects staging, we then make the changes in the newly created sandbox when the changes are made in the sandbox and get the go-ahead to push into staging for testing we only use staging for testing/stakeholders
Now the reason we do this is due to things being overwritten if changes are made DIRECT into staging which is inefficient and overall bad practice.
Now production, I assume you're talking about the live system (as production can be staging or live as they're under the same category) Live is an out of date version of staging, we want this though, as Live is customer facing we have releases which happen every few weeks ( this is where the staging date is analysed and deemed good to go) and thus is then pushed into live.
This is how the whole process goes, don't worry about int, its just a derivative of staging that is the first step into building into staging:
- 1. Sandbox
- 2. int
- 3. staging(UAT/production)
- 4. Live (production)
I hope this clears things up for you if not, reply to this and I'll be happy to explain in more detail for you.
New contributor
add a comment |
As a SF Developer, we have processes when it comes to this type of thing you're describing, firstly we do not use staging for direct changes to the system rather we create new sandboxes with referenced Jira ticket numbers (e.g. SALLY-369).
I'd like to also mention that we have data and metadata which I hope you're familiar with; when we get a ticket that is just a data change we do it straight into staging, and then once we get the all clear we create whatever it be into the live system
Secondly, I'd like to explain the whole pushing process at least for our company when a ticket is created we create a new sandbox that is a direct copy of staging but in no way affects staging, we then make the changes in the newly created sandbox when the changes are made in the sandbox and get the go-ahead to push into staging for testing we only use staging for testing/stakeholders
Now the reason we do this is due to things being overwritten if changes are made DIRECT into staging which is inefficient and overall bad practice.
Now production, I assume you're talking about the live system (as production can be staging or live as they're under the same category) Live is an out of date version of staging, we want this though, as Live is customer facing we have releases which happen every few weeks ( this is where the staging date is analysed and deemed good to go) and thus is then pushed into live.
This is how the whole process goes, don't worry about int, its just a derivative of staging that is the first step into building into staging:
- 1. Sandbox
- 2. int
- 3. staging(UAT/production)
- 4. Live (production)
I hope this clears things up for you if not, reply to this and I'll be happy to explain in more detail for you.
New contributor
add a comment |
As a SF Developer, we have processes when it comes to this type of thing you're describing, firstly we do not use staging for direct changes to the system rather we create new sandboxes with referenced Jira ticket numbers (e.g. SALLY-369).
I'd like to also mention that we have data and metadata which I hope you're familiar with; when we get a ticket that is just a data change we do it straight into staging, and then once we get the all clear we create whatever it be into the live system
Secondly, I'd like to explain the whole pushing process at least for our company when a ticket is created we create a new sandbox that is a direct copy of staging but in no way affects staging, we then make the changes in the newly created sandbox when the changes are made in the sandbox and get the go-ahead to push into staging for testing we only use staging for testing/stakeholders
Now the reason we do this is due to things being overwritten if changes are made DIRECT into staging which is inefficient and overall bad practice.
Now production, I assume you're talking about the live system (as production can be staging or live as they're under the same category) Live is an out of date version of staging, we want this though, as Live is customer facing we have releases which happen every few weeks ( this is where the staging date is analysed and deemed good to go) and thus is then pushed into live.
This is how the whole process goes, don't worry about int, its just a derivative of staging that is the first step into building into staging:
- 1. Sandbox
- 2. int
- 3. staging(UAT/production)
- 4. Live (production)
I hope this clears things up for you if not, reply to this and I'll be happy to explain in more detail for you.
New contributor
As a SF Developer, we have processes when it comes to this type of thing you're describing, firstly we do not use staging for direct changes to the system rather we create new sandboxes with referenced Jira ticket numbers (e.g. SALLY-369).
I'd like to also mention that we have data and metadata which I hope you're familiar with; when we get a ticket that is just a data change we do it straight into staging, and then once we get the all clear we create whatever it be into the live system
Secondly, I'd like to explain the whole pushing process at least for our company when a ticket is created we create a new sandbox that is a direct copy of staging but in no way affects staging, we then make the changes in the newly created sandbox when the changes are made in the sandbox and get the go-ahead to push into staging for testing we only use staging for testing/stakeholders
Now the reason we do this is due to things being overwritten if changes are made DIRECT into staging which is inefficient and overall bad practice.
Now production, I assume you're talking about the live system (as production can be staging or live as they're under the same category) Live is an out of date version of staging, we want this though, as Live is customer facing we have releases which happen every few weeks ( this is where the staging date is analysed and deemed good to go) and thus is then pushed into live.
This is how the whole process goes, don't worry about int, its just a derivative of staging that is the first step into building into staging:
- 1. Sandbox
- 2. int
- 3. staging(UAT/production)
- 4. Live (production)
I hope this clears things up for you if not, reply to this and I'll be happy to explain in more detail for you.
New contributor
New contributor
answered Feb 18 at 10:22
Harry SmithHarry Smith
1
1
New contributor
New contributor
add a comment |
add a comment |
Thanks for contributing an answer to Database Administrators Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f229988%2fdifference-between-a-staging-and-the-production-dwh%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Are you a software dev/admin/ops person now being asked to deal with a data warehouse? What's the context for this question please?
– seventyeightist
Feb 17 at 20:39
Exactly. I am normally a classical software developer and nowadays more involved in bi topics.
– STORM
Feb 18 at 6:44