C: How to read portion of a file in chunks




















For a course assignment I have to implement the Huffman encoding and decoding algorithm, first in the classic serial way, and then try to parallelize it using various methods (OpenMP, MPI, pthreads). The goal of the project is not necessarily to make it faster, but to analyze the results and explain why they are the way they are.



The serial version works perfectly. However, for the parallel version, I am stuck on a file-reading problem. In the serial version, I have a piece of code that looks like this:



    char *buffer = calloc(1, MAX_BUFF_SZ);

    while ((bytes_read = fread(buffer, 1, MAX_BUFF_SZ, input)) > 0) {
        compress_chunk(buffer, t, output);
        memset(buffer, 0, MAX_BUFF_SZ);
    }


This reads at most MAX_BUFF_SZ bytes from the input file and then compresses them. I used the memset call for the case when bytes_read < MAX_BUFF_SZ (maybe a cleaner solution exists, though).
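
On the "cleaner solution" point: one common alternative to zeroing the buffer is to pass the chunk handler an explicit byte count, so a short final read needs no special treatment and the data may even contain zero bytes. The following is only a minimal sketch under assumptions: process_chunk() is a hypothetical stand-in for compress_chunk() that simply copies the bytes, and process_stream() is an illustrative name, not part of the original code.

```c
#include <stdio.h>

#define MAX_BUFF_SZ 32

/* Hypothetical stand-in for compress_chunk(): it receives an explicit
 * length, so it never depends on a terminating '\0' in the buffer. */
static size_t process_chunk(const char *data, size_t len, FILE *out) {
    return fwrite(data, 1, len, out);
}

/* Handles everything from `in` in chunks of at most MAX_BUFF_SZ bytes
 * and returns the total number of bytes handled. */
static size_t process_stream(FILE *in, FILE *out) {
    char buffer[MAX_BUFF_SZ];
    size_t total = 0;
    size_t n;

    /* The inner parentheses matter: `>` binds tighter than `=`. */
    while ((n = fread(buffer, 1, sizeof buffer, in)) > 0) {
        total += process_chunk(buffer, n, out);
    }
    return total;
}
```

With this shape the memset call becomes unnecessary, because only the first n bytes of the buffer are ever looked at.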



However, for the parallel version (using OpenMP, for example), I want each thread to process only a portion of the file, with the reading still done in chunks. Given that each thread has an id thread_id and there are at most total_threads threads, I calculate the start and end positions as follows:



    int slice_size = (file_size + total_threads - 1) / total_threads;
    int start = slice_size * thread_id;
    int end = min((thread_id + 1) * slice_size, file_size);


I can move to the start position with a simple fseek(input, start, SEEK_SET) operation. However, I am not able to read the content in chunks. I tried with the following code (just to make sure the operation is okay):



    int total_bytes = 0;
    while ((bytes_read = fread(buffer, 1, MAX_BUFF_SZ, input)) > 0) {
        total_bytes += bytes_read;

        if (total_bytes >= end) {
            int diff = total_bytes - end;
            buffer[diff] = '\0';
            break;
        }

        fwrite(buffer, 1, bytes_read, output);
        memset(buffer, 0, MAX_BUFF_SZ);
    }


output is a different file for each thread. Even when I try with just 2 threads, some characters are missing from the output files. I think I am close to the right solution and have something like an off-by-one error.



So the question is: how can I read a slice of a file, but in chunks? Can you please help me identify the bug in the above code and make it work?



Edit:
If MAX_BUFF_SZ is bigger than the size of the input and I have, for example, 4 threads, what should clean code look like to ensure that T0 does all the work and T1, T2 and T3 do nothing?
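
One possible way to get this behaviour is to hand out slices in whole multiples of MAX_BUFF_SZ, so that a file smaller than one buffer falls entirely into the first slice and the other threads receive an empty range. This is only a sketch under assumptions: struct slice, min_long() and slice_for() are illustrative names that do not appear in the original code, and other partitioning schemes are equally valid.

```c
#include <stdio.h>

#define MAX_BUFF_SZ 32

/* Half-open byte range [start, end) assigned to one thread. */
struct slice {
    long start;
    long end;
};

static long min_long(long a, long b) {
    return a < b ? a : b;
}

/* Splits a file of `file_size` bytes among `total_threads` threads in
 * whole multiples of MAX_BUFF_SZ. A thread with nothing to do gets
 * start == end and can simply skip its read loop. */
static struct slice slice_for(long file_size, int thread_id, int total_threads) {
    long chunks = (file_size + MAX_BUFF_SZ - 1) / MAX_BUFF_SZ;
    long chunks_per_thread = (chunks + total_threads - 1) / total_threads;
    long slice_size = chunks_per_thread * MAX_BUFF_SZ;
    struct slice s;

    s.start = min_long(thread_id * slice_size, file_size);
    s.end = min_long(s.start + slice_size, file_size);
    return s;
}
```

For a 20-byte file and 4 threads this gives T0 the range [0, 20) and every other thread the empty range [20, 20); for a 100-byte file the ranges are [0, 32), [32, 64), [64, 96) and [96, 100).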



Some simple code that may be used to test the behavior follows (note that it is not from the Huffman code; it is just auxiliary code for testing things):



    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <omp.h>

    #define MAX_BUFF_SZ 32

    #define min(a, b) \
        ({ __typeof__ (a) _a = (a); \
           __typeof__ (b) _b = (b); \
           _a < _b ? _a : _b; })

    int get_filesize(char *filename) {
        FILE *f = fopen(filename, "r");
        fseek(f, 0L, SEEK_END);
        int size = ftell(f);
        fclose(f);

        return size;
    }

    static void compress(char *filename, int id, int tt) {
        int total_bytes = 0;
        int bytes_read;
        char *newname;
        char *buffer;
        FILE *output;
        FILE *input;
        int fsize;
        int slice;
        int start;
        int end;

        /* room for '-', the digits of id and the terminating '\0' */
        newname = (char *) malloc(strlen(filename) + 16);
        sprintf(newname, "%s-%d", filename, id);

        fsize = get_filesize(filename);
        buffer = calloc(1, MAX_BUFF_SZ);

        input = fopen(filename, "r");
        output = fopen(newname, "w");

        slice = (fsize + tt - 1) / tt;
        end = min((id + 1) * slice, fsize);
        start = slice * id;

        fseek(input, start, SEEK_SET);

        while ((bytes_read = fread(buffer, 1, MAX_BUFF_SZ, input)) > 0) {
            total_bytes += bytes_read;
            printf("%s\n", buffer);

            if (total_bytes >= end) {
                int diff = total_bytes - end;
                buffer[diff] = '\0';
                break;
            }

            fwrite(buffer, 1, bytes_read, output);
            memset(buffer, 0, MAX_BUFF_SZ);
        }

        fclose(output);
        fclose(input);
    }

    int main() {
        omp_set_num_threads(4);
        #pragma omp parallel
        {
            int tt = omp_get_num_threads();
            int id = omp_get_thread_num();
            compress("test.txt", id, tt);
        }
    }


You can compile it with gcc test.c -o test -fopenmp. You may generate a file test.txt with some random characters, more than 32 (or change the max buffer size).
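
A side note on get_filesize: the C standard only guarantees that ftell returns a meaningful byte offset for binary streams, so opening the file with "rb" and keeping the result in a long is the more portable choice (on POSIX systems "r" and "rb" behave the same). Below is a minimal sketch with basic error checking; stream_size() is a hypothetical helper name, not something from the original program.

```c
#include <stdio.h>

/* Returns the size in bytes of an open, seekable stream, or -1 on error.
 * The current file position is restored before returning. */
static long stream_size(FILE *f) {
    long pos = ftell(f);
    long size;

    if (pos < 0 || fseek(f, 0L, SEEK_END) != 0) {
        return -1;
    }
    size = ftell(f);
    if (fseek(f, pos, SEEK_SET) != 0) {
        return -1;
    }
    return size;
}
```

It could be called as stream_size(input) right after opening input with fopen(filename, "rb"), which also avoids opening the file twice.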



Edit 2:
Again, my problem is reading a slice of a file in chunks, not the analysis per se. I know how to do that. It's a university course; I can't just say "IO bound, end of story, analysis complete".










  • Threading the read will not make it any faster. Totally useless.
    – n.m. Nov 22 '18 at 20:01

  • The scope of the project is not to make it faster, but to analyze the results and talk about why it is not faster. Also, in the example I'm threading the read only for debugging purposes; the real version also does the compression in parallel, so each thread will compress a piece of the file and then I'll merge them. Please read the entire post :)
    – Adrian Pop Nov 22 '18 at 20:03

  • buffer[diff] = '\0'; - this is wrong. Think of when total_bytes is exactly equal to end. Then diff will be zero. In that case you want to keep the whole buffer, and you also want to write it to the output file, which you don't do at the moment.
    – kfx Nov 22 '18 at 20:21

  • Also, you want to compare total_bytes + start to end, not just total_bytes (assuming you did the initial fseek).
    – kfx Nov 22 '18 at 20:22

  • @kfx Yeah, I forgot to add the fwrite in that if (to write the remaining bytes); I also changed the check to int diff = total_bytes - end; buffer[total_bytes - diff] = '\0'; but there are still some problems.
    – Adrian Pop Nov 22 '18 at 20:25




















c file openmp chunks














edited Nov 23 '18 at 9:07
asked Nov 22 '18 at 19:57 by Adrian Pop



























1 Answer














Apparently I just needed a pen, some paper and a little diagram. After playing around with the indices, I came up with the following code (encbuff and written_bits are auxiliary variables I use, since I am actually writing bits to a file and use an intermediate buffer to limit the number of writes):



    while ((bytes_read = fread(buffer, 1, MAX_BUFF_SZ, input)) > 0) {
        total_bytes += bytes_read;

        if (start + total_bytes > end) {
            int diff = start + total_bytes - end;
            buffer[bytes_read - diff] = '\0';
            compress_chunk(buffer, t, output, encbuff, &written_bits);
            break;
        }

        compress_chunk(buffer, t, output, encbuff, &written_bits);
        memset(buffer, 0, MAX_BUFF_SZ);
    }
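
An alternative that avoids truncating the buffer after the fact is to track how many bytes of the slice are still left and never ask fread for more than that. The sketch below is not the code from this answer: copy_slice() is a hypothetical name, and the fwrite call stands in for compress_chunk(), whose real signature and handling of the buffer are only assumed here.

```c
#include <stdio.h>

#define MAX_BUFF_SZ 32

/* Copies the bytes in [start, end) from `input` to `output`, reading at
 * most MAX_BUFF_SZ bytes per fread() call. Returns the number of bytes
 * copied, or -1 if the seek fails. */
static long copy_slice(FILE *input, FILE *output, long start, long end) {
    char buffer[MAX_BUFF_SZ];
    long remaining = end - start;
    long copied = 0;

    if (fseek(input, start, SEEK_SET) != 0) {
        return -1;
    }
    while (remaining > 0) {
        /* Never request more than what is left in this slice. */
        size_t want = remaining < MAX_BUFF_SZ ? (size_t)remaining : MAX_BUFF_SZ;
        size_t got = fread(buffer, 1, want, input);

        if (got == 0) {
            break; /* end of file or read error */
        }
        fwrite(buffer, 1, got, output); /* stand-in for compress_chunk() */
        remaining -= (long)got;
        copied += (long)got;
    }
    return copied;
}
```

Each thread would call this on its own FILE handle with its own start and end; a thread whose slice is empty (start == end) never enters the loop.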


I also finished implementing the OpenMP version. For small files the serial version is faster, but from about 25 MB upward the parallel version starts to beat the serial one by about 35-45%. Thank you all for the advice.



Cheers!






answered Nov 22 '18 at 23:02 by Adrian Pop































