# docsbox [![Build Status](https://travis-ci.org/dveselov/docsbox.svg?branch=master)](https://travis-ci.org/dveselov/docsbox) `docsbox` is a standalone service that allows you convert office documents, like .docx and .pptx, into more useful filetypes like PDF, for viewing it in browser with PDF.js, or HTML for organizing full-text search of document content. `docsbox` uses **LibreOffice** (via **LibreOfficeKit**) for document converting. ```bash $ curl -F "file=@kittens.docx" http://localhost/api/v1/ { "id": "9b643d78-d0c8-4552-a0c5-111a89896176", "status": "queued" } $ curl http://localhost/api/v1/9b643d78-d0c8-4552-a0c5-111a89896176 { "id": "9b643d78-d0c8-4552-a0c5-111a89896176", "result_url": "/media/9b643d78-d0c8-4552-a0c5-111a89896176.zip", "status": "finished" } $ curl -O http://localhost/media/9b643d78-d0c8-4552-a0c5-111a89896176.zip $ unzip -l 9b643d78-d0c8-4552-a0c5-111a89896176.zip Archive: 9b643d78-d0c8-4552-a0c5-111a89896176.zip Length Date Time Name --------- ---------- ----- ---- 11135 2016-07-08 05:31 txt 373984 2016-07-08 05:31 pdf 147050 2016-07-08 05:31 html --------- ------- 532169 3 files ``` ```bash $ cat options.json { "formats": ["pdf"], "thumbnails": { "size": "640x480", } } $ curl -i -F "file=@kittens.ppt" -F "options=80/tcp docsbox_nginx_1 f6b55773c71d docsbox_rqworker "rq worker -c docsbox" 15 minutes ago Up 8 minutes docsbox_rqworker_1 662b08daefea docsbox_rqscheduler "rqscheduler -H redis" 15 minutes ago Up 8 minutes docsbox_rqscheduler_1 0364df126b36 docsbox_web "gunicorn -b :8000 do" 15 minutes ago Up 8 minutes 8000/tcp docsbox_web_1 5e8c8481e288 redis:latest "docker-entrypoint.sh" 9 hours ago Up 8 minutes 0.0.0.0:6379->6379/tcp docsbox_redis_1 ``` # Settings (env) ``` REDIS_URL - redis-server url (default: redis://redis:6379/0) REDIS_JOB_TIMEOUT - job timeout (default: 10 minutes) ORIGINAL_FILE_TTL - TTL for uploaded file in seconds (default: 10 minutes) RESULT_FILE_TTL - TTL for result file in seconds (default: 24 hours) THUMBNAILS_DPI - thumbnails dpi, for bigger thumbnails choice bigger values (default: 90) LIBREOFFICE_PATH - path to libreoffice (default: /usr/lib/libreoffice/program/) ``` # Scaling Within a single physical server, docsbox can be scaled by docker-compose: ```bash $ docker-compose scale web=4 rqworker=8 ``` For multi-host deployment you'll need to create global syncronized volume (e.g. with flocker), global redis-server and mount it at `docker-compose.yml` file. # Supported filetypes | Input | Output | Thumbnails | | ---------------------------------- | ------------------- | ---------- | | Document `doc` `docx` `odt` `rtf` | `pdf` `txt` `html` | `yes` | | Presentation `ppt` `pptx` `odp` | `pdf` `html` | `yes` | | Spreadsheet `xls` `xlsx` `ods` | `pdf` `csv` `html` | `yes` |