December 2021 Edition of The Resource
Hello Reader, here is this month’s iRODS news and developments!
Merry festival season all! May your servers stay up and your data flow successfully into the new year.
I’d love your thoughts and feedback on how this newsletter could be better for you.
Or just drop me a mail to say ‘Hi’. Always nice to hear from people, particularly in these pandemic times!
News
log4shell
In case you’ve not heard of the Log4j vulnerability, you’ll be relieved to know that so far only NFSRODS is affected, and there is a new release out to fix it.
Trirods
I missed it live, so I’m looking forward to watching Alan King talk about the iRODS Testing Environment in the latest Trirods presentation.
RENCI Development Update
The November 2021 development update (https://irods.org/2021/11/irods-development-update-november-2021/) was released. Lots there this month too, from NFSRODS to the indexing plugin to teasers about 4.2.11…
Breaking News!
As I write this newsletter, RENCI released 4.2.11!
“This release consists of 120 commits from 9 contributors and closed 124 issues.”
Take a look at the link above for full release details.
Main Repository Activity
Open Issues
Multiple plugins make unnecessary copies when fetching their configuration
Work on the 4.3.0 branch.
CMake: Test compiler flags before adding them.
Work on the 4.3.0 branch.
CMake: use CMAKE_DL_LIBS
Work on the… 4.2 backlog branch!
CMake: Use targets instead of variables for OpenSSL
Work on the 4.1 Branch… no, I joke! 4.2 backlog work.
Stopping already stopped service returns error
“The sysv init script on our irods give a non-zero return value if the stopped service is tried to be stopped again.”
People are still using sysv? How quaint! (CentOS 7.9, but applies to anyone who didn’t write their own system user service, and for whom the systemd emulation works ‘out of the box’, which apparently is a thing that happens.)
irods-grid: cannot find libboost_iostreams
4.3 work, so likely fixed long before you get to hit it in production…
ifsck help implies multiple paths on command line allowed
TL;DR - it doesn’t actually support that.
debug and source code analysis for irods
As well as some tips for code delving, I wanted to highlight this particular comment from Terrell;
“We are gathering this type of information here… https://github.com/irods/irods_development_environment
And we will be adding more here as 4.3.0 gets closer… https://github.com/irods/irods_docs/tree/master/docs/developers”
delayed msisync_to_archive that fails is not retried (put back in the delay queue)
Still under active work/investigation, but;
“Without the fix for #6029 (adding the single quotation marks), the sync_to_archive leaves a stale replica on the archive, which the following stat() cannot find… which returns an error… which is either being squashed on the way back up, or being ignored by the rule engine plugin framework.”
Fix remaining leaks in REPF/PREP
Memory leaks in Python rule engine plugin. Targeted for 4.2.12 fix.
Investigate removing irods_dynamic_cast
For the future, it seems (no target release)
Add support for Rocky Linux 8
If you’ve not heard of Rocky Linux, it is “a community enterprise operating system designed to be 100% bug-for-bug compatible with America’s top enterprise Linux distribution now that its downstream partner has shifted direction”, aka a CentOS replacement.
It’s not this. Or, sadly, this.
Investigate alternative backend storage system for metadata operations
“Externalize the metadata operations into a new pluggable API.
This would allow iRODS to use a different storage engine for metadata (e.g. NoSQL databases or some other technology) with the intent of improved query performance. The issue then becomes, how do we consolidate results between two different storage engines (i.e. permissions, etc.).
Related to #2066.”
If your metadata searches are getting slower and slower due to more and more metadata, voice your support for this feature!
Add build_directory option to build hook
4.2 backlog…
Add support for AlmaLinux 8
OK, I had to look this up, but it’s “an Open Source, community owned and governed, forever-free enterprise Linux distribution, focused on long-term stability, providing a robust production-grade platform. AlmaLinux OS is 1:1 binary compatible with RHEL® and pre-Stream CentOS.”
apt-key add has been deprecated
4.3 backlog.
Atomic metadata update api lookup fails
If you write code using the C API, this is a great discussion, with some working code, pointers, and talk around the documentation.
Eliminate calls to python scripts from native code
“These calls should be replaced with C/C++ implementations of the required functionality or removed altogether.”
Remove server dependency on irods-grid
“Currently, irodsctl shuts down the server by calling irods-grid, which is part of the irods-icommands package. We’d like to sever that requirement.”
nlohmann-json externals package is bad
I think the issue should be renamed; “the way the nlohmann-json package is managed in the build is bad”
modifying resource context via CLI results in stacktrace
“we’ve now fixed/closed irods/irods_resource_plugin_s3#1983
leaving this open for later confirmation/discussion.”
default_number_of_transfer_threads seems to be ignored
On-going ticket, flurry of discussion, resulting in;
“You can dial back the number of threads by changing that “16” to “default” and the server_config.json setting should be honored.”
On leaf-of-tree resource which is down, imv / irm succeed with error
This is a detailed issue with a lot of back and forth discussion and examples, and a deep dive into how moves and renames work ‘under the hood’. Well worth a read through (a few times!) to get your head round. If you have opinions on how it should work (or better, a pull request), chime in on the issue!
msiTarFileExtract (or ibun) attempts to create directories on wrong server
I think the pertinent comment from this issue is;
“It seems like the extraction process is always run on the server associated with the resource that is alphabetically the first in the list of resources where structFile has a copy.”
However, if you are a regular user of ibun, it’s worth reading this in depth.
www.irods.org has different hostname in certificate
“The website is currently an auto-redirect to irods.org (without www), where the cert is valid. Like you say, the other browsers forward without complaint, and then certify correctly. Safari catches the intermediate hop and complains.”
Issue also notes that using alternative names may also fix this.
irsync feature requests: detecting deleted files
A pretty in-depth discussion of how adding a --delete option, and then a -n option (a dry run, like on rsync), could/should work.
If this is something that would be useful to you (or you disagree with the principles espoused on either side), do chime in in the comments.
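To make the idea under discussion concrete, here is a minimal sketch of how detecting deleted files with a dry-run switch might behave. This is not irsync’s implementation, and the option is only a proposal in the issue; the function names, the flat lists of relative paths, and the `delete` callback are all assumptions made up for illustration.

```python
from typing import Callable, Iterable, List


def plan_deletions(source_paths: Iterable[str], target_paths: Iterable[str]) -> List[str]:
    """Return the target entries that no longer exist in the source (hypothetical helper)."""
    return sorted(set(target_paths) - set(source_paths))


def sync_deletions(
    source_paths: Iterable[str],
    target_paths: Iterable[str],
    delete: Callable[[str], None],
    dry_run: bool = True,
) -> List[str]:
    """Remove stale target entries, or only report them when dry_run is set."""
    stale = plan_deletions(source_paths, target_paths)
    for path in stale:
        if dry_run:
            # Dry run: say what would happen, but touch nothing.
            print(f"would delete: {path}")
        else:
            delete(path)
    return stale
```

Defaulting `dry_run` to True is a deliberate choice in this sketch: a destructive option is safer when the caller has to opt in to the actual deletion.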
rewrite igetwild as compiled icommand
This old issue got reawakened (but not, alas, committed to a release to be removed) - TL;DR it has some existing vulnerabilities, so best to avoid using it.
GenQuery multi-avu search order is non-optimized
My favourite 7.5-year-old issue springs back to life, zombie-like from the ashes, with a new approach and on track for 4.2.12!
This will make a huge difference to large metadata databases!
Closed Issues
Closed on - 2021-12-09 19:53:19 rebalance fails with lie about file size and dumps stack
Closed on - 2021-12-07 03:31:23 itrim seems not work
The background is that itrim has a fail safe to not go below two replicas. For that you have to use irm. Be very sure!
Closed on - 2021-12-04 23:43:58 iRODS 4.2.10 a failed delayed rule with retries defined on failures does not retry
Closed on - 2021-12-04 23:44:12 iRODS 4.2.10 replication of a file with a space fails in a compound resource using msisync_to_archive
Closed on - 2021-12-02 23:45:04 univmss has swapped values in error message
Closed on - 2021-12-02 19:27:16 ilsresc silently defaults to local zone when given non-existent zone.
Closed on - 2021-12-02 13:51:17 Undefined behavior caused by bug in Boost.Container’s PMR implementation
Closed on - 2021-12-02 13:50:22 fixed_buffer_resource.hpp has undefined behavior
Closed on - 2021-12-07 04:50:46 controller.py loses track of descendant processes if grandpa goes away prematurely during shutdown
“When controller.py tries to stop the server, it will check the pidfile for grandpa’s pid, and use this to find grandpa and descendants that match our server executables. If the server fails to shut down gracefully in a timely manner, it will kill all these processes manually. However, if grandpa shuts down but the descendants do not, controller.py loses track of them and assumes everything is all shut down.”
4.3.0 milestone
Closed on - 2021-11-13 13:52:20 An error occurred using icommand
Invalid request - the user here should be talking to their internal I.T. first.
Closed on - 2021-11-18 12:42:19 Non-package install is broken due to shared memory filename collision
Fixed in 4.2.11, for those people who like to run multiple servers on the same machine (outside of containers! Sounds like a complicated setup to me!).
Closed on - 2021-12-02 13:49:53 Investigate memory alignment for fixed_buffer_resource implementation
Closed on - 2021-11-23 03:23:22 imkdir -p noisy in logs when collection exists.
Closed on - 2021-12-07 04:40:22 irods-server package dependency on irods-icommands package is a tripping hazard
4.3.0 work.
Closed on - 2021-12-02 13:49:46 univMSSInterface.sh missing after upgrade of irods 4.1.x to 4.2.8
Closed on - 2021-12-03 20:49:07 SQL generated by rebalance is too long for GenQuery (too many resources)
Whilst this kicks the can down the road for those installations that have lots of resources, it’s still welcome as not only does it significantly increase the limit, but it also documents what that limit is.
Closed on - 2021-08-11 12:08:41 new api plugin - release proxy data object
Closed on - 2021-08-11 12:09:34 new api plugin - create proxy data object
Closed on - 2021-10-28 15:12:55 iadmin modresc rebalance failure
Closed because even the submitter couldn’t reproduce any more (ahem).
Python iRODS Client Activity
Open Issues
Trailing slash in path gives python irods error while the same path works using icommands
The PRC needed to strip the trailing slash (as the server seems not to do so reliably across the API). Sanitise your inputs, people!
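If you want to sanitise paths on your own side before handing them to a client library, a minimal sketch might look like the following. The function name is hypothetical and this is not the PRC’s own fix, just plain string handling that assumes trailing slashes carry no meaning for your paths.

```python
def normalize_logical_path(path: str) -> str:
    """Strip trailing slashes from a logical path, keeping a bare root "/" intact."""
    if not path:
        # Nothing to normalise; leave empty input for the caller to reject.
        return path
    stripped = path.rstrip("/")
    # A path made only of slashes collapses to the root.
    return stripped or "/"
```

For example, `normalize_logical_path("/tempZone/home/alice/data/")` gives back `/tempZone/home/alice/data` (the zone and user here are placeholders).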
Session.data_objects.put() overwrites without FORCE_FLAG_KW
(edited for readability - apologies it was a block of text!) TL;DR the upload functionality of the Python library doesn’t work the same as iput.
“From team discussion with @korydraughn , @trel , and @alanking earlier today it appears that the iRODS server used to enforce the FORCE_FLAG_KW in the PUT api; since 4.2.9 changes, however, even setting the OprType flag to 1 (PUT) in the data obj OPEN is no longer going to cause the python client (PRC) to hit the PUT api or policy.
Furthermore, multi-1247 will complicate the semantics of this thing.
Currently the PRC goes through an open/write/close sequence when put( ) is called; this is also roughly true of the PRC’s PTE (Parallel Transfer Engine) impl.
Maybe what’s needed – as much for enforcing FORCE_FLAG_KW as for the semantic tying-together of (N > 1) different Open operations on the parallel streams/threads – is to make a higher level API (callable from the Pure Python / XML-irodsProt client endpoint, mind you…) that time-wise brackets and functionality-wise encapsulates all of the lower-level operations on these 1-or-more streams.
This higher level thing might functionally force the whole shebang to be recognized as a PUT operation, or might be roughly equivalent or ancillary to the recently proposed prepare_to_receive API. It’s a complex topic with many parts and needs further discussion.”
irods.exception.CAT_NO_ACCESS_PERMISSION: failed to set access time for [file]
This might be an issue with the storage tiering plugin? Not sure I understand what was going on/wrong from this issue…
Exception ignored upon iRODSSession cleanup
Using anonymous+ticket to get() on multiple files fails where authenticated get() works
password in PAM authentication and irods_environment.json
“If you are unable to (or not wishing to : ) ) have iinit to run on the client platform, something like the below script might work.”
Closed Issues
Closed on - 2021-11-30 14:47:30 Multiple exceptions during put() when irods_path includes “’”
Fixed in 4.2.11
Closed on - 2021-12-01 00:37:45 Quasi-XML parser prevents proper query of unicode object names
Seems uploads were fine but downloads not so much. Fixed in master, waiting on next version.
Closed on - 2021-11-23 19:37:18 ‘pass’ statement together with ‘with’ not closing session
A discussion around how to
Closed on - 2021-12-09 02:23:27 Rule execution should allow the ‘null’ input parameter for the rule_file (.r)
“Once I would like to call any rule (in rule engines) by a rule file (.r) which doesn’t need INPUT parameter via PRC, I am hitting ‘ValueError’.
In other words, if the rule file’s (.r) input parameter is ‘null’, you then get “ValueError: need more than 1 value to unpack” error. I think this is because of L58, 59.
However, if I provide a dummy parameter for the input of .r file, then I can execute the rule but throws a parser error - -1201000.”
Now fixed in master.
Closed on - 2021-12-09 02:23:41 Rule execution should allow specification of a rule-engine instance
Closed on - 2021-12-09 02:24:11 Executing a rule requires reconnection / reauthentication to iRODS
Interesting one this;
“Currently, after executing a rule, the PRC automatically closes its iRODS connection”
but then later in the issue;
“It seems that there is some caching made per connection, the cleanup is the only way (apparently) to get updated information. I am running into this right now, and having to cleanup at every interaction is less than ideal.”
The fix appears to be implemented as suggested;
“How about we add a session_cleanup=True parameter to execute()? Or to the Rule class as parameter? So, that when you know that your rule does not require an update of information to the client, you can choose to keep the connection alive. By setting the default to true, the old behavior remains intact.”
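The design point in that suggestion, an opt-out flag whose default preserves the old behaviour, can be sketched with stand-in classes. Nothing below is the PRC’s actual code: `FakeSession`, this `Rule` class and the string returned by `execute()` are assumptions invented purely to show the pattern.

```python
class FakeSession:
    """Stand-in for a client session; it only counts how often cleanup() ran."""

    def __init__(self) -> None:
        self.cleanup_calls = 0

    def cleanup(self) -> None:
        self.cleanup_calls += 1


class Rule:
    """Hypothetical rule wrapper illustrating a backwards-compatible flag."""

    def __init__(self, session: FakeSession, body: str) -> None:
        self.session = session
        self.body = body

    def execute(self, session_cleanup: bool = True) -> str:
        result = f"ran: {self.body}"  # placeholder for real rule execution
        if session_cleanup:
            # Default path: behave as before and drop the connection state.
            self.session.cleanup()
        return result
```

Callers who never pass `session_cleanup` see no change, while callers who know their rule does not alter anything the client caches can pass `session_cleanup=False` and keep the connection alive.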
Closed on - 2021-12-01 00:37:26 Backtick (`) encoding issue in iRODS XML protocol handling
“When using the XML-based protocol, iRODS encodes backticks as &apos;, deviating from the XML standard. Clients such as PRC and irods-php that use an XML library for decoding iRODS messages incorrectly produce apostrophe (') characters wherever backticks are encountered (e.g. file names).
See the original iRODS issue: irods/irods#4132”
Then fixed with;
“So, we can probably close this particular backtick issue as ‘solved’. And we should open a new enhancement issue for PRC to learn to speak the binary protocol upon request (or as a fallback/retry… or perhaps… always?).”
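To see why a standards-following XML library trips over this, here is a small sketch using Python’s built-in parser. It assumes the entity involved is `&apos;`, and the message and file name are made up. The second function is a naive workaround shown only for illustration, not the PRC’s actual fix, and it assumes the server never sends `&apos;` for a genuine apostrophe.

```python
import xml.etree.ElementTree as ET


def decode_standard(message: str) -> str:
    """Decode with a standard XML parser, which maps &apos; to an apostrophe."""
    return ET.fromstring(message).text or ""


def decode_backtick_aware(message: str) -> str:
    """Naive workaround: rewrite &apos; to the backtick character reference first."""
    # &#96; is the numeric character reference for U+0060 (the backtick).
    return ET.fromstring(message.replace("&apos;", "&#96;")).text or ""


message = "<name>report&apos;2021.txt</name>"  # made-up protocol fragment
print(decode_standard(message))        # apostrophe where a backtick was meant
print(decode_backtick_aware(message))  # backtick restored
```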
NFSRODS Activity
Open Issues
provide graceful IP-to-user mapping configuration to allow ‘root’ access in VM
“Squash all remote R/W access from a given IP to a single definable user (e.g., export /p to 10.0.1.2 as user1; export /p to 10.0.1.3 as user2, …) But, also have a way to update the mappings without restarting the NFS daemon and affecting existing connections.”
It’s being discussed, so if you have opinions or use cases, please do represent them in the issue.
Combine subsequent calls to logger
Closed Issues
Closed on - 2021-12-10 12:25:03 log4J vulnerability
IMPORTANT: If you have any security constraints about your NFSRODS installs, you should upgrade to 2.0.3 to get the #Log4Shell patch.
Closed on - 2021-11-11 22:46:04 NFSRODS needs a way to select the same replica for parallel writes
“current work can be found in #129
should be merged before too long, will be in the next release.”
Closed on - 2021-12-07 16:56:06 Add “Changes Since X.Y.Z” section to the README
Not sure why this was decided to be not needed? Seemed a good idea to me! Especially as there isn’t a CHANGES.txt or similar…
Closed on - 2021-12-02 20:28:35 Add connection pooling/caching options
Development activity for testing it seems.
icommands Activity
Closed Issues
Closed on - 2021-12-09 17:44:32 ihelp does not mention iunreg
iunreg added in 4.2.9, so prior versions have no knowledge of it.
Externals Activity
Nothing reported this month.
YODA Activity
Nothing reported this month.
If you think someone else would appreciate this newsletter, they can sign up at https://theresource.metadata.school/
Two Yaks were shaved in the making of this newsletter.