October 2021 Edition of The Resource
Hello Reader, here is this month’s iRODS news and developments!
If you’re facing an issue with iRODS you’re not sure how to solve, please do drop me a line; if I’ve come across a solution or seen something relevant elsewhere, I’ll do my best to let you know. Or drop me a mail to say ‘Hi’. Always lovely to hear from people, particularly in these pandemic times!
I’d love your thoughts and feedback on how this newsletter could be better for you.
News
Metalnx 2.5.1 Release
Metalnx 2.5.1 is released.
The changelog says, “[#282] Make collections with space in name clickable” - which sounds like a point release to me!
Two new iRODS clients are released!
iRODS C++ REST API v0.8.0 and iRODS Zone Management Tool (ZMT) v0.1.0
The announcement is here, and I’m going to quote it. The link also has the presentations from the iRODS User Group Meeting, which cover the new features in more depth.
“The iRODS Consortium is pleased to announce the initial release of two new iRODS clients this morning.
One depends on the other, so they make their debut together.”
iRODS C++ REST API
“The iRODS C++ REST API is released today as v0.8.0.
The initial design and implementation was by Jason Coposky, and then refactored and pushed over the finish line by Kory Draughn. It provides a mid-tier REST API (usually to be run on port 80/443) that translates calls to the iRODS Protocol (usually port 1247). This opens up development opportunities widely and we are excited to see what new applications can be built more rapidly with web-standard tools.
Plans for v1.0.0 include possible repackaging or dockerization for possible concurrent/side-by-side deployments of different versions over time.”
iRODS Zone Management Tool (ZMT)
“The iRODS Zone Management Tool (ZMT) is released today as v0.1.0.
This application was designed and implemented by Bo Zhou and Terrell Russell. This ReactJS application provides an administrative web GUI to an iRODS Zone.
Deployment requires a single configuration file pointing to an iRODS C++ REST API endpoint and then docker-compose up.”
2022 UGM Announced!
I think I mentioned this in the last newsletter, but if so, it bears mentioning twice!
- July 5 - iRODS Training
- July 6-7 - Conference Talks
- July 8 - iRODS Troubleshooting
Main Repository Activity
Open Issues
Use more inclusive language
RENCI is moving away from the ‘master/slave’ terminology.
Use better names for queuing functions
Add regex constraint to negotiation_keys in JSON schema v4
“The documentation says:
The 'negotiation_key' must be precisely 32 alphanumeric bytes long.
This was not being enforced in the JSON schema for server_config.json. Also, underscores should be allowed.”
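For the curious, the rule quoted above can be sketched as a regular expression. This is only a minimal illustration: the pattern is my reading of "32 alphanumeric bytes, underscores allowed", the pattern that ends up in the schema may differ, and `is_valid_negotiation_key` is a hypothetical helper rather than anything shipped with iRODS.

```python
import re

# Assumed pattern: exactly 32 characters, each an ASCII letter, digit or underscore.
# The pattern actually adopted in the iRODS JSON schema may differ.
NEGOTIATION_KEY_PATTERN = re.compile(r"[A-Za-z0-9_]{32}")


def is_valid_negotiation_key(key: str) -> bool:
    """Return True if key is exactly 32 alphanumeric/underscore characters."""
    # fullmatch() requires the whole string to match, so shorter or longer keys fail.
    return NEGOTIATION_KEY_PATTERN.fullmatch(key) is not None


print(is_valid_negotiation_key("A1_b" * 8))   # True: 32 allowed characters
print(is_valid_negotiation_key(""))           # False: empty key
print(is_valid_negotiation_key("a" * 33))     # False: one character too long
```

A check like this could be run against a key before it goes into server_config.json, which would also catch the empty-key case.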
negotiation_key match check is ignored
Worth knowing if you rely on the negotiation key to secure your server against rogue servers (does anyone? Genuinely interested! Let me know!).
Consider adding iadmin sub-command for building resource hierarchies atomically
You might think this is niche, but if you are (near) constantly ingesting data into your servers, moving resources between trees can cause a write outage, which often means a change requires planning and announcement. This proposal would make the change happen in one operation, removing that constraint.
irods resource plugin using libcephfs2
In which your author debates the resilience of CephFS vs RADOS. Have an opinion (preferably based on use!)? Please contribute.
irods-devel RPM should declare ‘replaces’ for irods-dev
If you upgrade 4.1.x to 4.2.x on Red Hat/CentOS, this is a trip hazard worth knowing beforehand!
Empty negotiation_key in server_config.json crashes agents
using a too long servername should not stacktrace
Beware of using a server with a hostname with a single label longer than 63(+null) characters.
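If you want to check your own hostnames before finding out the hard way, a minimal sketch follows. It only applies the 63-character-per-label limit mentioned above; `overlong_labels` is a hypothetical helper and not part of any iRODS tooling.

```python
from typing import List

MAX_LABEL_LENGTH = 63  # maximum length of a single DNS label


def overlong_labels(hostname: str) -> List[str]:
    """Return the dot-separated labels of hostname that exceed 63 characters."""
    # A trailing dot (fully qualified form) would otherwise produce an empty last label.
    labels = hostname.rstrip(".").split(".")
    return [label for label in labels if len(label) > MAX_LABEL_LENGTH]


print(overlong_labels("irods.example.org"))               # []
print(len(overlong_labels("a" * 64 + ".example.org")))    # 1
```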
Rewrite database plugin using established ODBC library
I think this is MySQL only, as it refers to [Refactor database plugin for MySQL semantics](https://github.com/irods/irods/issues/4917).
Non empty files registered with size 0 with both correct checksum on a replicated resource
I’ve also seen this, although not when using ibun; in my case it was HAProxy timing out the control connection. I’m hopeful 4.2.10 onwards will also fix that issue.
write ticket for collection does not allow upload of new data objects
Replication fails silently
The replication here applies to invocations of irepl, as reported. I’ve not seen this on 4.2.7, but I note the reporter is replicating to a cache resource. That isn’t something I’ve used, so it could be a confounding factor.
irods-server package dependency on irods-icommands package is a tripping hazard
Better packaging tags, FTW!
irods-runtime deb package must declare relationship with earlier irods-server package due to moved files
There is a bit of a ‘how the sausage is made’ view of the build/packaging concerns here! I think this is a problem in 4.2.10 that will be solved in 4.2.11, but I may be misunderstanding the issue.
Overwriting a replica takes much longer than creating a new one
“If the replica is unlinked before creating the new one (thereby overwriting the replica), the slowdown due to overwriting would be avoided.”
iquest: join AVUs multiple times
If you want to do more than simple metadata queries using iquest (it IS possible!), it is worth reading this issue. For now, I would stick with imeta.
Improve error messaging for expired auth token
Excise curl dependency from libirods_common
Various dual password implementation corner cases
“it seems that an effort to work around this was made in one situation by assuming that a timestamp starting 9999% always meant native and was true for all native values. This has the issue that a PAM timeout could theoretically start 9999 as that is an integer. Expanding the match to 9999-% and rolling this out to everywhere R_USER_PASSWORD is touched would seem to be the smallest fix, though adding an auth_scheme column to R_USER_PASSWORD and working from that would be cleaner.”
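The difference between the two patterns in that quote is easy to see with a toy check. This sketch uses SQLite's LIKE purely as a stand-in for the catalog database, and both sample values are my assumptions: the native timestamp form and the PAM value are illustrative, not taken from a real R_USER_PASSWORD table.

```python
import sqlite3
from contextlib import closing

# Assumed sample values, for illustration only.
NATIVE_EXPIRY = "9999-12-31-23.59.01"  # assumed form of a native expiry timestamp
PAM_EXPIRY = "99990"                   # hypothetical integer PAM timeout starting with 9999


def like(value: str, pattern: str) -> bool:
    """Evaluate `value LIKE pattern` in an in-memory SQLite database."""
    with closing(sqlite3.connect(":memory:")) as conn:
        row = conn.execute("SELECT ? LIKE ?", (value, pattern)).fetchone()
    return bool(row[0])


print(like(NATIVE_EXPIRY, "9999%"))   # True
print(like(PAM_EXPIRY, "9999%"))      # True: the PAM value is mistaken for native
print(like(NATIVE_EXPIRY, "9999-%"))  # True
print(like(PAM_EXPIRY, "9999-%"))     # False: the stricter pattern tells them apart
```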
Converting keyValPair_t value to path serializes entire keyValPair_t object
‘Legacy’ Rules Engine.
multiple irepls to the same tree - launched before the first completes
“I believe with 4.2.9+, the second and third concurrent irepl call in this scenario will be blocked and return immediately since that data object is logically locked.”
However.
“If they are launched truly in parallel, the database race condition described here still applies until we fix it: #5742.”
iquest fails after (TCP timeout + 60 sec + querytime)
The cause of the reported issue turned out to be networking issues on the client host due to an OpenStack Hypervisor migration.
Blank Origin and Label fields in packages.irods.org Release/InRelease files
iRods RPM breaks common expectations for packages
How does your site/install do it? Do you rely on the user created by the package, or are you happy for the setup to do so?
CMake consistency sweep for 4.2.11
cyberduck and irods file transfer
An ongoing issue, but with a recent update:
“I’ve created a new issue to not send the file data itself (the buf) to AMQP by default.
irods/irods_rule_engine_plugin_audit_amqp#61”
iquest cannot match names containing apostrophes via = operator if it’s not the last condition in the where clause
iphymv error when moving replica into a composite tree
“trel added the consortium-member label 11 days ago.”
optimization/atomicity when manipulating resource hierarchies
“trel added this to the 4.3.0 milestone 25 days ago.”
Use irods::filesystem status information in lsUtil beyond collection/data object determination
“trel added this to the 4.3.0 milestone 25 days ago.”
Allow resources to request single thread transfers
“trel added this to the 4.3.0 milestone 25 days ago.”
add atomic api endpoint for catalog operations
“trel added this to the 4.3.0 milestone 25 days ago.”
ifsck should have a way to ignore subdirectories
“trel added this to the 4.3.0 milestone 25 days ago.”
pam_password_max_time does not allow pass_expiry_ts above 1209600
“trel assigned korydraughn 25 days ago.”
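In case 1209600 looks like a magic number: assuming the setting is expressed in seconds (my assumption, not stated in the issue title), it works out to a round number of days. `seconds_to_days` is just a hypothetical helper for the arithmetic.

```python
from datetime import timedelta

# Assumption: pam_password_max_time is expressed in seconds.
PAM_PASSWORD_MAX_TIME_LIMIT = 1209600


def seconds_to_days(seconds: int) -> float:
    """Convert a duration in seconds to days."""
    return timedelta(seconds=seconds) / timedelta(days=1)


print(seconds_to_days(PAM_PASSWORD_MAX_TIME_LIMIT))  # 14.0
```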
[ilsresc -l and iadmin lr return parent id with no easy tool to convert to a name](https://github.com/irods/irods/issues/5069)
“trel added this to the 4.3.0 milestone 25 days ago.”
Separate authentication step in connection pool
“trel added this to the 4.3.0 milestone 25 days ago.”
Support for Ubuntu 20.04
“trel added this to the 4.3.0 milestone 22 days ago.
trel assigned SwooshyCueb 22 days ago.”
‘ireg --repl’ registers a non-existent file
“alanking added this to the 4.2.11 milestone 29 days ago.”
‘in’ in IN() is INVALID
“trel unassigned jasoncoposky 25 days ago” :-(
pam_password_max_time not working
“trel assigned korydraughn 25 days ago.”
Rename irodsReServer to irodsDelayServer
“trel assigned korydraughn and unassigned alanking 10 days ago.”
genQuery uses ordered strstr() to find where condition keywords
“trel unassigned jasoncoposky 25 days ago.”
add -r option to ireg
I have no idea why this issue got marked as updated by GitHub!
imeta search order is non-optimized
An oldie but a goodie. Very important if you have lots (millions) of metadata entries to search.
Closed Issues
closed on - 2021-10-15 06:20:23 test issue
The test worked!
closed on - 2021-10-14 20:07:51 server_properties::map() should return reference to member variable
closed on - 2021-10-12 00:28:04 define IRODS_EXTERNALS_FULLPATH_PISTACHE in irods-dev package
closed on - 2021-10-14 20:07:19 Main server is forking a new delay server process every 5 seconds
closed on - 2021-09-28 06:50:48 irods Server automatically open multi-hundred process a night
“I have found the bug, it is due to configuring metalnx [job] parameter incorrectly.
Each Metalnx seems holding one connection to irods server indefinitely, as alan said. And ips cannot recognize it
leaving UNKNOWN connection type.”
closed on - 2021-09-23 12:24:45 Capture path of iRODS externals library - Spdlog
closed on - 2021-09-28 03:05:30 curl dependency and userspace packaging
closed on - 2021-09-28 16:04:12 multiple irepls to the same tree - launched after the first completes
closed on - 2021-09-23 00:59:38 irods recompile and delete directly from filesystem
closed on - 2021-10-14 16:02:25 Crash over long collection name
Python iRODS Client Activity
Open Issues
Get of data object fails when data object is accessible through ticket and by anonymous user
Document tickets in README
PRC cannot start anonymous user session from environment files
Long PAM password/token string causes PACKSTRUCT error
What is the purpose of session.iRODSSession.numThreads?
“Since numThreads seems to be used only in messages sent to the server. I’ve noticed this setting being a part of a number of iRODS server data structures, so I’d assume it has more to do with server internals - and possibly to do with the old, high ports implementation of parallel data movement, but none of this impacts multithreading within the client.
The new client driven parallel transfer logic that @trel mentioned (aka multi-1247) also uses threads but via the concurrent.futures module.
The threading RLock object I believe is there in the Pool logic to prevent multiple threads from grabbing the same server connection. This makes the session object more thread-safe, not that there are any guarantees on the PRC being thread-safe (although it could be cajoled in that direction, if care were taken to keep the threads’ iRODS concerns sufficiently separated.).”
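To make the locking idea in that quote concrete, here is a toy sketch of a pool that uses an RLock so that two threads can never be handed the same connection. It is not the PRC's Pool implementation; `ToyPool`, `demo` and the integer "connections" are all hypothetical stand-ins.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ToyPool:
    """Toy connection pool; NOT the PRC implementation."""

    def __init__(self, connections):
        self._lock = threading.RLock()
        self._idle = list(connections)
        self._active = set()

    def acquire(self):
        # The lock makes "take an idle connection and mark it active" one atomic step.
        with self._lock:
            conn = self._idle.pop()
            self._active.add(conn)
            return conn

    def release(self, conn):
        with self._lock:
            self._active.remove(conn)
            self._idle.append(conn)


def demo(n=8):
    """Have n worker threads each acquire one connection and return what they got."""
    pool = ToyPool(range(n))
    with ThreadPoolExecutor(max_workers=n) as executor:
        return sorted(executor.map(lambda _: pool.acquire(), range(n)))


print(demo())  # [0, 1, 2, 3, 4, 5, 6, 7]: every thread got a different connection
```

As the quote says, a lock like this only protects the hand-out of connections; it makes no promise that whatever each thread then does with its connection is thread-safe.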
Executing a rule requires reconnection / reauthentication to iRODS
replace dependency xmlrunner
Closed Issues
closed on - 2021-09-24 14:39:13 Put operation ignores kw.FORCE_FLAG_KW flag
“So that I'm clear on the decision - put ignores the flag by design?
Yes.
As I understand it…
A PRC put is not an iRODS put… but rather an iRODS open/write/close. And the force flag is not honored on the iRODS open/write/close series of operations.
So, yes, #285 is the reason your sess.data_objects.put() is failing.
Your use case for “data need updating and the final data object name must not change” should work fine for production… once we get #285 fixed up.”
closed on - 2021-09-24 22:03:56 Put operation over an existing object raises a KeyError (-406000 locked data object)
NFSRods Activity
Open Issues
NFSRODS needs a way to select the same replica for parallel writes
Tests were added to confirm the fix in 4.2.11.
icommands Activity
Open Issues
iquest: Retrieving creation time and modify time of tickets fails
Retrieving TICKET_CREATE_TIME or TICKET_MODIFY_TIME via iquest hasn’t worked for ten years, it seems!
YODA Activity
Open Issues
[FEATURE] davrods on different host than yoda
If you think someone else would appreciate this newsletter, they can sign up at https://theresource.metadata.school/
Two Yaks were shaved in the making of this newsletter.