November 2021 Edition of The Resource
Hello Reader, here is this month’s iRODS news and developments!
I know what you’re thinking - wasn’t there a newsletter at the start of the month?
Yes. Yes there was. I was away the Monday the newsletter was supposed to come out, so I scheduled it in ConvertKit.
Unfortunately… I scheduled it for the first Monday in November, not the earlier week. So you get two newsletters within a few weeks!
Rest assured, though, they have different content. Mostly.
If you’re facing an issue with iRODS you’re not sure how to solve, please do drop me a line; if I’ve come across a solution or seen something relevant elsewhere, I’ll do my best to let you know. Or just drop me a mail to say ‘Hi’. Always nice to hear from people, particularly in these pandemic times!
I’d love your thoughts and feedback on how this newsletter could be better for you.
News
October Development Update
iRODS Development Update: October 2021
Utrecht University Open Science Awards
I can’t find a link to the ADAPT project, which is a shame, since it sounds like just the sort of thing that would go well with YODA. If you’ve found it, let me know?
Psst. Fancy a sneak peek?
I might have been working on a few things recently… Take a look here and also maybe try out this
Main Repository Activity
Open Issues
Non-package install is broken due to shared memory filename collision
I must admit that this issue made me raise my eyebrows somewhat (it comes, I think, from a discussion on the iRODS Chat mailing list). The discussion was around running multiple iRODS servers on the same OS (presumably on different ports), not using containers. Exciting! Terrifying? Something you do? Do let me know…
Atomic metadata update api lookup fails
Worth reading for those who code against the ‘C’ API.
Subsequent rule failure causes an error message suppressed by errorcode to be displayed
‘Legacy’ Rule Engine, reported on 4.2.8.
Make database plugin packages side-by-side installable
Errors on collection names containing apostrophes
This will be fixed as part of #3902
I believe this is a different manifestation of the identical ‘ and ‘ syntax in #4983
We have not demonstrated an SQL injection - it is an error in the GenQuery parser itself, so the client-sent string never makes it to the database interface.
Marking as duplicate, and adding to the list in #3902.
server should generate its own ticket strings
Eliminate calls to python scripts from native code
Define consistent semantics around archive files (ibun and imcoll)
If your organisation uses or plans to use either of these commands, then reading and responding to this Issue (which is also an RFC) is a good idea.
provide client connection information to acPreConnect
User requested to be able to determine whether to use SSL or not based on the incoming network connection (internal network or external). Original thread at https://groups.google.com/g/irod-chat/c/3afhUiB2A0k.
This would be done within acPreConnect(), however, there is no connection information available to that PEP to make an internal/external determination. It is handed a manufactured, empty rei.
ireg (and ireg --repl) temporarily registers non-existent files
Investigate memory alignment for fixed_buffer_resource implementation
imkdir -p noisy in logs when collection exists.
Race conditions are racey.
Remove server dependency on irods-grid
irodsctl needs icommands to be in the PATH environment variable
In other news, systemd is the gift that keeps on giving.
Refactor time handling code around delay hints
Fix delay hints parser for “DOUBLE” directives
Fully serialize structs in RE serialization
Server should only accept epoch seconds as timestamps
Remove all static policy enforcement points (PEPs)
If you use the Legacy Rule Engine (and most of us do, at least for basic Zone tasks like checksums), then keep an eye on this issue and report back if you have specific use cases.
Deprecate all static policy enforcement points (PEPs)
See above!
Add macros for check_sent_sid and sign_server_sid
Tickets can accumulate in ICAT with no recourse for admin
iadmin rmticket to remove tickets not associated with any user, to prevent their build-up in the object catalog.
This would be analogous to iadmin rum, which removes metadata AVUs untied to any objects.
Cannot enact data object parallel transfers via a ticket.
irods-server package dependency on irods-icommands package is a tripping hazard
iquest: join AVUs multiple times
Object locking/state issues when iput/irm clients contend heavily for one file (operation locking)
irods breaks MySQL 8 GTID replication
I wonder if similar things lurk for PostgreSQL? I’ve not encountered any for Oracle or PostgreSQL yet, but this makes me wonder.
Release activities for 4.2.11
Not there… yet.
Are the example rules still valid?
bumping to 4.3.0, so we can cleanly update core.re.template along with a number of other changes.
irsync does not honor “ignore symlinks flag --link”, when symlink is broken
While an open issue, this got bumped as it was reported that has the same issue.
unixodbc version too old for postgres12
Not sure this is something we can fix - it’s on the distribution to package something newer.
Define ACL meaning
Bumped to 4.3…
inconsistent naming in ACL’s
Bumped to 4.3…
icp breaks when data_object name contains “’ and ‘”
Bumped to 4.3…
Postgres Version Compatibility
Bumped to 4.3…
ireg --repl affords multiple replicas on a single resource
4.2.6+
Uploads during rebalance create extra replica
…hello, after 2 more years.
As the implementation of logical locking for creates and writes is now mostly complete, we have realized that this problem can only be solved by what I am calling Operation Locking. I have explained this in detail at the iRODS UGM2021, which can be found here under the title “Logical Locking”: https://irods.org/ugm2021/ Whitepaper forthcoming in the UGM2021 Proceedings.
This feature will require even more sweeping changes in iRODS to get it right, so this will not be included in the 4.2.x series.
This issue is related to #3930. Operation locking is being implemented in #5742.
install libraries into /usr/lib64 on CentOS?
The remaining work on this now appears to be:
As a side note, the library is installed to /usr/lib. I think it’s customary to use /usr/lib64 on a RHEL/CentOS/Fedora system (certainly libRodsAPIs.a is the only library located in /usr/lib on our test RHEL system).
‘ireg --repl’ registers a non-existent file
Updated as it’s getting some fixes.
setup_irods.py should prompt/configure for different/standalone delay server
note: changing setup_irods.py will require changes in the plugins/database/packaging/localhost_setup_*.input files
files have >2 replicas, where one of them is zero length (but not marked dirty)
This issue is related to #4314. This should be resolved with Operation Locking.
client plugin mechanism (irods 4.2.x) causes dynamic_cast to return null and subsequently segfault
related to #3425
GenQuery multi-avu search order is non-optimized
korydraughn added this to the 4.2.12 milestone 5 days ago
Looks like it won’t make it to 4.2.11, alas.
database connection pooling for iRODS
perhaps relevant… https://github.com/yandex/odyssey
Closed Issues
closed on - 2021-11-06 04:11:30 Cannot delete delay rule when REI information in database is large
rsRuleExecDel cannot delete a delay rule because the database plugin does not contain logic for handling columns holding large amounts of data (i.e. r_rule_exec.exe_context).
closed on - 2021-11-06 04:11:49 Incorrect auth check on delay rule deletion
closed on - 2021-11-04 14:10:20 Include key path in exception message on failed server property lookup
closed on - 2021-11-06 04:09:53 Expose case-insensitive / no-distinct search options in query iterator
Related: irods/irods_client_rest_cpp#71
closed on - 2021-10-29 12:23:49 issue in irodsctl start for validation of the configuration files.
A must-read if you use Red Hat/CentOS:
It seems to be related to this change: https://letsencrypt.org/docs/dst-root-ca-x3-expiration-september-2021/
Which is breaking for clients that still use OpenSSL 1.0.2, like CentOS 7 does. See https://community.letsencrypt.org/t/openssl-client-compatibility-changes-for-let-s-encrypt-certificates/143816 for details
It can be fixed without disabling schema validation by removing the “DST Root CA X3” root cert from your certifi store. Since certifi no longer provides updates for Python 2.7, this needs to be fixed manually by editing your certifi cacert.pem file (on our distribution that’s /usr/lib/python2.7/site-packages/certifi/cacert.pem; you can use certifi.where() to find it if it’s in a different place on your system).
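The manual edit described above can be sketched in Python. This is a minimal sketch and not an official fix: it assumes the bundle follows the usual certifi layout, where each certificate is preceded by '#' comment lines that name it (for example a line containing "DST Root CA X3"), and the helper name drop_cert_by_label is made up for illustration. Back up cacert.pem before trying anything like this.

```python
END_MARKER = "-----END CERTIFICATE-----"


def drop_cert_by_label(bundle_text, label):
    """Return bundle_text without the certificate entry whose comment
    header mentions the given label.

    Assumes each entry is a run of '#' comment lines followed by one
    PEM block (the layout used in certifi's cacert.pem).
    """
    kept = []
    entry = []
    # splitlines(True) keeps the line endings so the output is byte-for-byte
    # identical apart from the removed entry.
    for line in bundle_text.splitlines(True):
        entry.append(line)
        if line.strip() == END_MARKER:
            # The entry is complete; keep it unless its header names the label.
            if not any(l.startswith("#") and label in l for l in entry):
                kept.extend(entry)
            entry = []
    # Preserve anything that follows the last certificate.
    kept.extend(entry)
    return "".join(kept)
```

To apply it, read the file at the path returned by certifi.where(), pass its text through drop_cert_by_label(text, "DST Root CA X3"), and write the result back.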
closed on - 2021-11-05 01:41:35 problems for automatically extract metadata from another host
closed on - 2021-10-27 19:16:11 irods-icommands cannot install for bionic or focal, bad certificate
TL;DR: it will work if you update the package first… before adding the packages.irods.org repository…
closed on - 2021-11-11 21:07:36 new_database_connection()’s parameter should default to false
The following function always reads and parses server_config.json for database credentials. This is unnecessary given that database credentials do not change often.
closed on - 2021-11-11 21:07:30 Allow “iadmin addchildtoresc” to update the parent of a child atomically
iadmin addchildtoresc will result in a CHILD_HAS_PARENT error if the child resource already has a parent resource. Given that this operation is just an update of a single column, there is no risk in allowing the previous value to be overwritten. This change would make performing tree surgery more atomic.
closed on - 2021-10-18 17:29:22 Add regex constraint to negotiation_keys in JSON schema v4
The documentation says:
The 'negotiation_key' must be exactly 32 alphanumeric bytes long.
This was not being enforced in the JSON schema for server_config.json. Also, underscores should be allowed.
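As a rough illustration of the constraint described above, the check could look like this in Python. The pattern is my own reading of the issue text (exactly 32 characters, each a letter, digit or underscore); the pattern actually added to the schema may differ, and is_valid_negotiation_key is a made-up helper name.

```python
import re

# Assumed pattern: exactly 32 characters, each a letter, digit or underscore.
NEGOTIATION_KEY_RE = re.compile(r"[A-Za-z0-9_]{32}")


def is_valid_negotiation_key(key):
    """Return True if key matches the assumed 32-character constraint."""
    return NEGOTIATION_KEY_RE.fullmatch(key) is not None
```

A quick check like this before starting the server can catch a key that is too short or contains characters such as '-' that the schema would reject.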
closed on - 2021-10-28 11:25:42 Multiple sources of truth in Provider configuration files
closed on - 2021-10-27 20:34:21 where to find for a new provider
closed on - 2021-11-11 01:35:12 Installation documentation lacks information
closed on - 2021-11-11 21:07:47 add atomic api endpoint for catalog operations
closed on - 2021-11-08 18:12:32 iRODS doesn’t handle or report failure of one irodsServer process
closed on - 2021-10-29 00:45:27 returns for a user with groupadmin privileges (https://github.com/irods/irods/issues/4224)
closed on - 2021-10-28 15:12:55 iadmin modresc rebalance failure
closed on - 2021-10-18 17:29:11 consolidate all configuration json into server_config.json
closed on - 2021-10-21 15:02:08 “register” keyword is deprecated
closed on - 2021-11-05 17:25:04 ireg silence failure in registrations of files
Python iRODS Client Activity
Open Issues
Possible better and cleaner ticket implementation
It seems possible both client and server aspects of handling tickets could be more straightforward, from a library perspective.
Investigate:
do queries for a data object in the Python client need to be more complicated when access is granted via their parent collection? Is this due to a mirroring complication in the server?
do icommands handle things more directly when accessing/querying objects via tickets?
how does the server implement tickets, and can it be improved / replaced?
What is the purpose of session.iRODSSession.numThreads?
numThreads seems to be used only in messages sent to the server. I’ve noticed this setting being a part of a number of iRODS server data structures, so I’d assume it has more to do with server internals, and possibly with the old, high ports implementation of parallel data movement, but none of this impacts multithreading within the client.
The new client driven parallel transfer logic that @trel mentioned (aka multi-1247) also uses threads but via the concurrent.futures module. The threading RLock object I believe is there in the Pool logic to prevent multiple threads from grabbing the same server connection. This makes the session object more thread-safe, not that there are any guarantees on the PRC being thread-safe (although it could be cajoled in that direction, if care were taken to keep the threads’ iRODS concerns sufficiently separated.)
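To make the pattern in that explanation concrete, here is a minimal, self-contained sketch of a lock-guarded connection pool shared by worker threads from concurrent.futures. It is not the PRC’s actual Pool code: FakeConnection, TinyPool and transfer_chunk are hypothetical names, and the “connections” are plain Python objects rather than iRODS server connections.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeConnection:
    """Stand-in for a server connection (hypothetical, for illustration)."""

    def __init__(self, ident):
        self.ident = ident


class TinyPool:
    """Hands out idle connections; the RLock stops two threads from
    taking the same connection at the same time."""

    def __init__(self):
        self._lock = threading.RLock()
        self._idle = []
        self._active = set()
        self._created = 0

    def get_connection(self):
        with self._lock:
            if self._idle:
                conn = self._idle.pop()
            else:
                # No idle connection available, so make a new one.
                self._created += 1
                conn = FakeConnection(self._created)
            self._active.add(conn)
            return conn

    def release_connection(self, conn):
        with self._lock:
            self._active.discard(conn)
            self._idle.append(conn)


def transfer_chunk(pool, chunk_index):
    """Pretend to move one chunk using a connection of its own."""
    conn = pool.get_connection()
    try:
        return (chunk_index, conn.ident)
    finally:
        pool.release_connection(conn)


pool = TinyPool()
with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(lambda i: transfer_chunk(pool, i), range(8)))
```

Each worker holds its own connection for the duration of its chunk, which is the kind of separation of concerns the comment above alludes to; the lock only protects the pool’s bookkeeping, not whatever is done with a connection once it has been handed out.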
method for ‘list tickets’
Some sample code as well if you need this now.
Closed Issues
closed on - 2021-10-27 17:19:25 Collection create_time is not exposed in collection object
Contribution from the community - nice to see!
closed on - 2021-11-10 16:11:18 Implement full ticket functionality
Give PRC a full Tickets API to mirror what is available in iticket and in the iRODS server.
closed on - 2021-11-12 17:13:44 checksum with verification option is broken
Eek! You might want to pull from master if you upload files using the library, until this is released as part of a new version.
In the present state of iRODS 4-2-stable and PRC, the RError mechanism in the client can no longer be used to return warning messages when the server raises CHECK_VERIFICATION_RESULTS. Because of this, irods.test.data_obj_test.TestDataObjOps.test_verify_chksum__282_287 fails. The client now needs to be taught to trap CHECK_VERIFICATION_RESULTS and receive the appropriate warning messages (regarding, e.g. unchecksummed replicas) from the server.
closed on - 2021-11-08 19:20:40 Get of data object fails when data object is accessible through ticket and by anonymous user
Fixes merged into Master.
closed on - 2021-11-10 21:51:32 Document tickets in README
closed on - 2021-11-10 16:12:07 PRC cannot start anonymous user session from environment files
closed on - 2021-10-26 13:46:27 replace dependency xmlrunner
NFSRods Activity
Open Issues
Combine subsequent calls to logger
This will probably speed it up a bit too?
password appearing in log
Issue copying a file greater than 55Meg to nfsrods
This is copying using the NFSRODS mount into iRODS. Copying the same out doesn’t appear to have the same issue? However, see #86, below.
No log output on Ubuntu 18
I don’t recall needing to do this for the 2.0 install, so it might be on new installs?
Add parallel transfer
This will significantly speed up transfer operations on files larger than 32 MB.
Closed Issues
closed on - 2021-11-11 22:46:04 NFSRODS needs a way to select the same replica for parallel writes
See #124 above - if you write to iRODS using NFSRODS, it’s worth reading. It’s not clear to me though if the issue is fixed (as there is no linked commit), more that this was a discussion on how to approach the fix? It’s certainly not tagged for having been fixed in a particular version.
icommands Activity
Open Issues
icd not handling special characters correctly
We are using iRODS version 4.2.8. We have a collection whose name has special characters in it, including new lines. When I attempt to use icd to navigate into it from its parent collection, icd changes the current working collection to my home collection and exits without error.
Whew boy. Reminds me of Little Bobby Tables.
YODA Activity
Open Issues
[FEATURE] For EUA make ‘domain’ variable
Thankfully, this also contains a diff of the code required to be changed to allow end users to access the web interface who have email domains that aren’t *.uu.nl.
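The general idea of turning a hard-coded domain into a variable can be sketched as follows. This is not the diff from the issue and not YODA code: the function name is_internal_user, the internal_domains parameter and the use of shell-style patterns are all my own assumptions for illustration.

```python
from fnmatch import fnmatch


def is_internal_user(email, internal_domains):
    """Return True if the email's domain matches any configured pattern.

    internal_domains is a list of shell-style patterns such as "*.uu.nl"
    (a hypothetical configuration value, not a real YODA setting).
    """
    domain = email.rsplit("@", 1)[-1].lower()
    return any(fnmatch(domain, pattern.lower()) for pattern in internal_domains)
```

With the patterns supplied from configuration, a site outside Utrecht could list its own domains instead of patching the code. Note that a pattern like "*.uu.nl" matches subdomains only, so a bare "uu.nl" address would need its own entry.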
[FEATURE] Federate the iRODS instance behind YODA with another iRODS instance
If you think someone else would appreciate this newsletter, they can sign up at https://theresource.metadata.school/
Two Yaks were shaved in the making of this newsletter.