
Why is the result of merging multiple rasters so big?


I am trying to merge 14 GeoTIFFs like this:

Each GeoTIFF is about 50 MB. I need a GeoTIFF as output.

My workflow :

gdalbuildvrt -input_file_list list.txt test.vrt

(where my list contains the name of the tifs)

Then :

gdal_translate -of Gtiff test.vrt test.tif
Input file size is 79841, 59955

It works, but the result is a GeoTIFF of 13.3 GB! For 14 files of 50 MB each, I expected a GeoTIFF of about 700 MB, not 13 GB.
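The 13 GB figure is roughly what an uncompressed mosaic of this size occupies. A quick check of the arithmetic, assuming 3 bands (RGB) of 8-bit samples, which is not stated explicitly in the question:

```python
# Uncompressed size of the mosaic reported by gdal_translate (79841 x 59955).
# ASSUMPTION: 3 bands (RGB) at 1 byte per sample; the band count is not given in the post.
width, height = 79841, 59955
bands, bytes_per_sample = 3, 1

size_bytes = width * height * bands * bytes_per_sample
size_gib = size_bytes / 2**30

print(f"{size_bytes} bytes = {size_gib:.2f} GiB")
```

Under that assumption the result is about 13.4 GiB, close to the observed output, which suggests the output is simply stored uncompressed.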

I know that gdal does not compress by default, so I tried this command :

gdal_translate -of Gtiff -co COMPRESS=JPEG test.vrt test_compressed.tif

But the "merge" of the file is too big for JPEG compression :

Input file size is 79841, 59955
0ERROR 1: JPEGPreEncode:Strip/tile too large for JPEG
ERROR 1: WriteEncodedTile/Strip() failed.
ERROR 1: JPEGPreEncode:Strip/tile too large for JPEG
ERROR 1: WriteEncodedTile/Strip() failed.
ERROR 1: An error occured while writing a dirty block…
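One likely reading of this error, offered as an assumption rather than something confirmed in the thread: without TILED=YES the GeoTIFF is written in strips that span the full image width, and JPEG cannot encode a block wider or taller than 65535 pixels. A small check of that limit against the mosaic width (the strip height of 1 row is a hypothetical value for illustration):

```python
# JPEG stores image dimensions in 16-bit fields, so neither side of an
# encoded block may exceed 65535 pixels.
JPEG_MAX_DIM = 65535

def fits_in_jpeg(block_width: int, block_height: int) -> bool:
    """Return True if a block of this size is within JPEG's dimension limit."""
    return block_width <= JPEG_MAX_DIM and block_height <= JPEG_MAX_DIM

mosaic_width = 79841  # from "Input file size is 79841, 59955"

# Striped layout: each strip is as wide as the whole image
# (strip height of 1 row is a hypothetical value).
strip_ok = fits_in_jpeg(mosaic_width, 1)
# Tiled layout with 512 x 512 blocks.
tile_ok = fits_in_jpeg(512, 512)

print("strip fits:", strip_ok)
print("tile fits:", tile_ok)
```

If this reading is right, the mosaic is not too big for JPEG as such; only the full-width strips are, and tiling the output avoids the problem.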

So I tried another workflow: I converted all my TIFFs to JPEG (14 MB each), built a VRT file and translated it with LZW compression. But the output GeoTIFF is about 5 GB.

Could you tell me what the best practice is for this job, and whether it is possible to obtain one GeoTIFF of 14*50 MB?

I haven't tried it, but I thought about merging these TIFFs in Photoshop and then re-georeferencing the result with the upper-left/lower-right coordinates. With this workflow I think I would get 14*50 MB, but I'm not sure. And I want to learn GDAL best practices, so I haven't tried it for the moment.


Coming to bytes: if the input TIFFs are 8-bit and the export is 32-bit by default, you will get into serious trouble, so make sure you keep the data type as it is. And remember: the full TIFF will probably be closer to 20 x 50 MB, as a TIFF is always rectangular.

If I understand correctly, the numbers I marked in green on this screenshot have to be the same on the left and on the right?

Your output image will have more pixels than the sum of your input images, but this does not explain the large difference. I suggest that you look at the characteristics of your images based on gdalinfo in order to see what compression is used and check that the extents are correct.

The 14 TIFFs of 50 MB were originally 14 TIFFs of 700 MB that I processed with gdal_translate with -co COMPRESS=JPEG. I compressed the rasters in order to reduce their size, but maybe it was not a good idea?

This screenshot shows the gdalinfo output for the same GeoTIFF (01.tif) twice: on the left, the uncompressed GeoTIFF of 700 MB, and on the right the same GeoTIFF with COMPRESS=JPEG, so 50 MB, with the differences in green:

As far as I can tell, the extents are correct, because in QGIS the images match other data sources and satellite imagery.

*assuming your input images have the same size, it makes 20000 * 12000 pixels per input images, which is large for an image of 50 Mb, maybe you are crossing the extent of your coordinate system when you create the mosaic.*

I'm not sure I understand what you mean by "crossing the extent". But I tried to open my 5 GB LZW file in QGIS, and the extent is good, because it matches other data sources.

Your answer makes me realize that the GeoTIFFs do not all have the same size. Do you think this could be the cause of the size increase when they are merged, in case GDAL prefers files of the same size? I ran gdalinfo on each GeoTIFF to get its size, and there is a very small difference between them:

02.tif Size is 19956, 11981
03.tif Size is 19959, 11993
04.tif Size is 19961, 11992
05.tif Size is 19958, 11993
06.tif Size is 19958, 11990
07.tif Size is 19956, 11984
08.tif Size is 19956, 11993
09.tif Size is 19958, 11993
10.tif Size is 19958, 11989
11.tif Size is 19958, 11985
12.tif Size is 19958, 11993
13.tif Size is 19959, 11993
14.tif Size is 19960, 11994
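To put these sizes next to the mosaic size, here is a small sketch that compares the pixel counts. It uses only the 13 sizes listed here (01.tif is not listed) and the 79841 x 59955 mosaic size printed by gdal_translate; the variable names are illustrative.

```python
# Sizes reported by gdalinfo for 02.tif .. 14.tif (01.tif is not listed in the post).
sizes = [
    (19956, 11981), (19959, 11993), (19961, 11992), (19958, 11993),
    (19958, 11990), (19956, 11984), (19956, 11993), (19958, 11993),
    (19958, 11989), (19958, 11985), (19958, 11993), (19959, 11993),
    (19960, 11994),
]
mosaic_width, mosaic_height = 79841, 59955

input_pixels = sum(w * h for w, h in sizes)
mosaic_pixels = mosaic_width * mosaic_height

# How many input tiles fit across and down the mosaic's bounding rectangle.
mean_width = sum(w for w, _ in sizes) / len(sizes)
mean_height = sum(h for _, h in sizes) / len(sizes)
tiles_across = round(mosaic_width / mean_width)
tiles_down = round(mosaic_height / mean_height)

print("input pixels :", input_pixels)
print("mosaic pixels:", mosaic_pixels)
print("grid         :", tiles_across, "x", tiles_down)
```

The mosaic's bounding rectangle works out to about 4 x 5 tile footprints, so it holds more pixels than the inputs combined. The few pixels of difference between individual files are negligible by comparison.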

Then you should look at the pixel depth of your images : if your input were in Bytes, then you should keep bytes.
gdal_translate -of Gtiff -ot Byte -co COMPRESS=LZW test.vrt test.tif

I tried this command, but GDAL told me that the maximum TIFF file size was exceeded.

Input file size is 79841, 59955
0… 10… 20… 30… 40… 50…
ERROR 1: TIFFAppendToStrip:Maximum TIFF file size exceeded. Use BIGTIFF=YES creation option.
ERROR 1: WriteEncodedTile/Strip() failed.

But if I have to create a BigTIFF, it does not solve my problem, because the file is then more than 4 GB. Is the pixel depth important in my case? (These are HD photos of maps that were then georeferenced, not a DEM.)

Remark 1 : Converting your images to jpeg before building a vrt doesn't help and you might lose data.

It is not serious if I lose a little bit of information. I would prefer not to lose it, of course, but if I have to, it's not a problem. I was convinced that the output would be lighter if I worked with JPEG, but in the end this is not true when the output is a GeoTIFF. So this is not a good solution, and I am giving it up.

Remark 2 : using a vrt is helpful: are you sure that you need GTiff ?

Yes, I need a GeoTIFF, because I have to import it into a mobile application which needs GeoTIFF input to work. (I think the app can take geospatial PDF input too, but I have never worked with it, and I want to understand my problem with GDAL, because it is not the first time I have had it.)


I tried -co tiled=yes -co bigtiff=yes -co compress=jpeg -co photometric=ycbcr, and I also tried -co TILED=yes -co BLOCKXSIZE=512 -co BLOCKYSIZE=512.

These 2 commands work well: I get a size of ~700 MB. It is exactly what I expected.

Now I have another problem: QGIS cannot open the file quickly. I waited 15 minutes and then quit before QGIS had opened the TIFF. I don't know why. It also does not work in my Android app (maybe because of "tiled=yes"). I will have to read some documentation on my own.


Your output image will have more pixels than the sum of your input images, but this does not explain the large difference. I suggest that you look at the characteristics of your images based on gdalinfo in order to see what compression is used and check that the extents are correct. (assuming your input images have the same size, it makes 20000 * 12000 pixels per input images, which is large for an image of 50 Mb, maybe you are crossing the extent of your coordinate system when you create the mosaic.) Then you should look at the pixel depth of your images : if your input were in Bytes, then you should keep bytes.

gdal_translate -of Gtiff -ot Byte -co COMPRESS=LZW test.vrt test.tif

Remark 1 : Converting your images to jpeg before building a vrt doesn't help (it will be uncompressed before the next step) and you might lose data.

Remark 2 : using a vrt is helpful: are you sure that you need GTiff ?

EDIT: There is no miracle with the size of your images, but you should use a tiled tif as output so that you can use the jpeg compression with your large data (-co TILED=yes -co BLOCKXSIZE=512 -co BLOCKYSIZE=512 ). If it remains too big, the only solution is then to use gdalwarp to resample at a lower resolution.
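As an illustration of the tiled output suggested here, the following sketch assembles a gdal_translate call that combines the tiling options above with the JPEG options mentioned elsewhere in the thread. The function name and the output file name test_tiled.tif are hypothetical, and PHOTOMETRIC=YCBCR assumes 3-band, 8-bit RGB input, which the thread does not state explicitly.

```python
import shlex

def build_translate_cmd(src, dst, block_size=512):
    """Return a gdal_translate argument list for a tiled, JPEG-compressed BigTIFF."""
    creation_options = [
        "TILED=YES",
        f"BLOCKXSIZE={block_size}",
        f"BLOCKYSIZE={block_size}",
        "COMPRESS=JPEG",
        "PHOTOMETRIC=YCBCR",  # ASSUMPTION: input is 3-band, 8-bit RGB
        "BIGTIFF=YES",
    ]
    cmd = ["gdal_translate", "-of", "GTiff"]
    for option in creation_options:
        cmd += ["-co", option]
    cmd += [src, dst]
    return cmd

# "test_tiled.tif" is a hypothetical output name.
cmd = build_translate_cmd("test.vrt", "test_tiled.tif")
print(shlex.join(cmd))

# To actually run it (requires GDAL on the PATH):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

The sketch only prints the command, so it can be inspected or pasted into a shell before anything is written to disk.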

