Oozie workflow for cleancontext #216
2 Participants
Reference: D-Net/dnet-hadoop#216
This PR extends the Oozie workflow used for cleaning with a final step that removes unwanted contexts from results. It also adds a parameter to the cleaning workflow, shouldCleanContext, which enables the context cleaning when set to true.
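One common way to gate an optional step on a workflow parameter in Oozie is a decision node. The fragment below is only a sketch of that pattern, not the code in this PR: the node names should_clean_context, clean_context and End are hypothetical, and only the parameter name shouldCleanContext comes from the description above.

```xml
<!-- Sketch only: node names are hypothetical, not taken from this PR. -->
<decision name="should_clean_context">
    <switch>
        <!-- Run the context-cleaning action only when the parameter is true -->
        <case to="clean_context">${wf:conf('shouldCleanContext') eq true}</case>
        <!-- Otherwise skip the step and finish the workflow -->
        <default to="End"/>
    </switch>
</decision>
```

With this shape, the preceding cleaning action would transition to should_clean_context instead of directly to the end node, so existing runs are unaffected when the parameter is false.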
@ -16,0 +18,4 @@
<description>true if the context have to be cleaned</description>
</property>
<property>
<name>contextId</name>
It would be better to include a description for this parameter, explaining its purpose and, if it is possible to specify multiple contextIds, how they should be formatted.
This is just the first naive implementation of the context cleaning. I have no idea what it will look like once it is done properly.
It might be the first naive implementation, but looking at the Oozie workflow, it is not obvious what role a parameter plays when it is not accompanied by any description.
extended
@ -16,0 +22,4 @@
<value>sobigdata</value>
</property>
<property>
<name>verifyParam</name>
Missing description.
The same holds as for the previous comment. Anyway, if you think it is important to have the descriptions, I will add them.
I think it is. Again: this PR might just provide a first implementation, but the business logic around these two parameters exists only in your head. To understand their role I would need to open the actual job implementation and reverse engineer it. I would appreciate it if you could add a description.
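To make the request concrete, a documented parameters block might look like the sketch below. The parameter names, the sobigdata default and the shouldCleanContext description are taken from the diff hunks above; the contextId description is a guess based on the PR summary, and the verifyParam description is left as a placeholder because its semantics are not stated anywhere in this thread.

```xml
<!-- Sketch only: descriptions marked hypothetical are not from this PR. -->
<parameters>
    <property>
        <name>shouldCleanContext</name>
        <description>true if the context have to be cleaned</description>
    </property>
    <property>
        <name>contextId</name>
        <value>sobigdata</value>
        <!-- Hypothetical wording, inferred from the PR summary -->
        <description>identifier of the context to remove from results</description>
    </property>
    <property>
        <name>verifyParam</name>
        <!-- Placeholder: the role of this parameter is not documented here -->
        <description>TODO: describe what is verified and the expected format</description>
    </property>
</parameters>
```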
Minor changes, please check the comments inline.
@ -1,16 +1,13 @@
At first glance, this class doesn't seem to include any significant change. If it was not changed, please revert to its original formatting. Otherwise the diff just creates noise.
This comment is outdated. It seems you did not issue `git pull` before introducing these further changes, so you did not get the reformatted file CleanContextSparkJob.java.