Commit Graph

596 Commits

Author SHA1 Message Date
Spyros Zoupanos f8e91cdc5c processLogs.updateProdTables. I need feedback for processLogs.portalStats to see why they never end 2020-09-13 12:23:03 +03:00
Spyros Zoupanos 9caac3e3e3 portalStats finished - Needs testing. Working on updateProdTables 2020-09-12 21:24:31 +03:00
Spyros Zoupanos 8ddf1dcc15 processPortalLog finished 2020-09-12 20:13:33 +03:00
Spyros Zoupanos 968d53f119 Finished downloadsStats 2020-09-11 20:10:37 +03:00
Spyros Zoupanos f78b5d3f86 More progress on viewsStats 2020-09-10 22:37:48 +03:00
Spyros Zoupanos 2d2d1b9694 More progress on viewsStats 2020-09-10 22:27:19 +03:00
Spyros Zoupanos 1d9f8f79a8 Finished cleanOAI 2020-09-09 21:59:04 +03:00
Spyros Zoupanos 398f1f6f15 More progress. Cleaning view double clicks 2020-09-07 21:57:45 +03:00
Spyros Zoupanos 81102dd791 Removing not needed jar by reflection 2020-09-07 20:54:47 +03:00
Spyros Zoupanos 719f9e3cd9 Adding systout messages (should be transformed to log messages) 2020-09-07 20:44:01 +03:00
Spyros Zoupanos e2c70f64ed More progress on loading JSON Serde jar 2020-09-07 00:01:05 +03:00
Spyros Zoupanos 5af2abbea5 Moving variable declarations to a more appropriate place, adding drop table code 2020-09-04 19:49:07 +03:00
Spyros Zoupanos cf7b9c6db3 More progress on adding queries to the code. Initial database and table creation seems OK. Downloading logs from available piwik_ids 2020-09-02 21:02:56 +03:00
Spyros Zoupanos 637e61bb0f Getting the right piwik_ids from (graph) stats db 2020-09-01 22:06:16 +03:00
Spyros Zoupanos d770d7043d Adding a better .gitignore 2020-09-01 19:42:09 +03:00
Spyros Zoupanos 293d6accd4 More progress on adding piwiklogtmp to the code 2020-09-01 19:05:38 +03:00
Spyros Zoupanos f3dda9858c More progress - Adding queries to code 2020-08-31 23:19:15 +03:00
Spyros Zoupanos 8db9a7ccdc Changes to download Sarc stats 2020-07-25 13:17:47 +03:00
Spyros Zoupanos c035fa7648 Changes to download Irus Stats 2020-07-22 19:22:04 +03:00
Spyros Zoupanos 4c00343bbd More progress 2020-06-05 20:39:51 +03:00
Spyros Zoupanos b213da51c4 Modifying JSON saving procedure to make the files usable by HIVE JsonSerDe 2020-05-21 21:49:33 +03:00
Spyros Zoupanos bf820a98b4 Removing the not needed download code that ignores SSL certificates and uses username/password for authentication. Repository ids are provided manually for the moment until the Hive stats DB provides the correct piwik_id 2020-05-19 18:45:28 +03:00
Spyros Zoupanos 9cdea87c7a More progress on download jsons. All certificates are ignored & authentication is done with username & pass 2020-05-16 13:16:16 +03:00
Spyros Zoupanos 66c7ddfc5e More progress on SQL statements and parameters 2020-05-14 22:27:18 +03:00
Spyros Zoupanos 98ba2d0282 The workflow starts 2020-05-12 20:38:31 +03:00
Spyros Zoupanos 0b6f302652 Adding also an update example with the appropriate table definition 2020-05-11 19:53:41 +03:00
Spyros Zoupanos c0b509abfb Simple java action added. Simple java connection to hive db + basic statements added 2020-05-09 15:51:22 +03:00
Spyros Zoupanos cabe92d155 Changes to make it compile successfully 2020-05-07 21:46:14 +03:00
Spyros Zoupanos af62b14f91 Adding the main java files, the directory structure and main workflow file 2020-05-07 19:00:03 +03:00
Michele Artini ac0da5a7ee Partial implementation of broker events 2020-05-07 12:31:26 +02:00
Claudio Atzori 17860d3ab6 general changes in the RAW graph mapping: missing collectedfrom/hostedby causes records to be skipped; factored out most of the constants in ModelConstants class (dhp-schemas) 2020-05-06 13:20:02 +02:00
Claudio Atzori fdfecc9578 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2020-05-06 11:28:01 +02:00
Claudio Atzori c79e2f5977 drop workingPath before starting the dedup workflow 2020-05-06 11:27:44 +02:00
Michele Artini 8f30a09d84 Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop 2020-05-05 17:12:22 +02:00
Michele Artini ccc609f909 new module for the production of broker events 2020-05-05 17:09:00 +02:00
Claudio Atzori 0825321d0b improved unit tests in dhp-aggregation 2020-05-05 12:39:04 +02:00
Claudio Atzori 4a8487165c using long param names in wf definition 2020-05-04 19:19:29 +02:00
Claudio Atzori a2fc37df5f adjusted parameters 2020-05-04 19:18:59 +02:00
Claudio Atzori f1b7e14036 code formatting 2020-05-04 19:18:34 +02:00
Claudio Atzori 405f495d54 code formatting 2020-05-04 19:18:12 +02:00
Claudio Atzori de5fbe325c bits of javadoc 2020-05-04 16:00:48 +02:00
miconis 085cf173d7 Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop 2020-05-04 12:08:20 +02:00
miconis 3df703f67d mergerels added to propagate relations 2020-05-04 12:08:12 +02:00
Claudio Atzori bac37b3973 fixed children expansion in XML records 2020-05-04 11:51:17 +02:00
Claudio Atzori 077ccd8743 stats wf properties cleanup 2020-05-04 11:41:46 +02:00
Michele Artini eb9bd42970 fixed a problem with journals 2020-04-30 11:06:05 +02:00
Michele Artini a0a6109bbc fixed a problem with journals 2020-04-30 11:03:46 +02:00
Claudio Atzori 439c6255a2 cleanup 2020-04-29 19:09:07 +02:00
Claudio Atzori 77ac995770 cleaned up poms, added descriptions 2020-04-29 18:44:17 +02:00
Claudio Atzori 64d790a266 updated maven plugin dependencies 2020-04-29 16:56:18 +02:00