
The Resource

May 23, 2022

May 2022 Edition of The Resource

Hello Reader, here is this month’s iRODS news and developments!

If you’re facing an issue with iRODS you’re not sure how to solve, please do drop me a line; if I’ve come across a solution or seen something relevant elsewhere, I’ll do my best to let you know. Or just drop me a mail to say ‘Hi’. Always nice to hear from people, particularly in these pandemic times!

I’d love your thoughts and feedback on how this newsletter could be better for you.

News

“Managing Petabytes of Data Using iRODS.”

During the Great Plains Network Annual Meeting 2022 in Kansas City, Senior Applications Engineer Justin James will host a breakout session called “Managing Petabytes of Data Using iRODS.”

"Large collaborative research projects present a myriad of challenges for those who are responsible for maintaining the IT infrastructure.  Researchers need their data to be discoverable without necessarily being burdened by having to know exactly where and how their data is stored, replicated, etc.  Project managers need the ability to set policies on the data - how it is accessed, how long it must be saved, and when it should be archived.  These policies are driven by both internal project considerations and external governmental regulations.  Managers may want an audit trail of when data is created, accessed, or deleted.  The workflow may require the automatic processing of data when it is created from external sources, such as microscopes, satellites, remote monitoring sites, etc.
iRODS (Integrated Rule-Oriented Data System) was created to solve these and other similar problems. With its virtual filesystem, metadata catalog, rule engine, and federation capabilities, iRODS can help administrations manage their petabytes of data.
The presenter will discuss how iRODS solves the problems listed above and explain the configuration-not-coding paradigm that has been created to solve many of the common use cases encountered in the iRODS user community. He will give a demonstration of the use of iRODS to store files in heterogeneous storage systems, both local and cloud based, emphasizing the parallel nature of S3 uploads, including performance comparisons between the iRODS S3 plugin and the Amazon CLI utility."

UGM 2022

Registration for the 14th Annual iRODS User Group Meeting is open!

This year’s meeting will take place July 5-8 and will be hosted by KU Leuven.

Abstracts and Talks

Interested in presenting at the iRODS User Group Meeting 2022?

Abstracts for talks / papers are due June 1 to be considered.

ISC HPC

iRODS will host a booth at this year’s ISC HPC Conference in Hamburg. Executive Director Terrell Russell will be available for questions and conversations at Booth F607.

Main Repository Activity

Open Issues

​Replace time.clock with time.process_time in python tests​

4.3.0;

"According to the documentation, time.clock was deprecated in python 3.3 and it seems it was removed in python 3.8.
The equivalent call seems to be time.process_time.
More information here: https://docs.python.org/3.7/library/time.html#time.clock"
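
For anyone with old test helpers of their own, here is a minimal sketch of the swap. The cpu_seconds helper is my own illustration, not code from the iRODS test suite. Note that time.process_time counts CPU time for the current process only, so time spent sleeping is not included; time.perf_counter is the other documented replacement if you want elapsed wall-clock time instead.

```python
import time


def cpu_seconds(func, *args, **kwargs):
    """Return (result, CPU seconds this process used while func ran)."""
    start = time.process_time()  # stands in for the removed time.clock()
    result = func(*args, **kwargs)
    return result, time.process_time() - start


result, elapsed = cpu_seconds(sum, range(100_000))
print(f"sum={result} cpu={elapsed:.6f}s")
```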

​release activities for 4.3.1​

Steady… This just means that 4.3.0 is right around the corner! Like the UGM…

​CAT_SQL_ERR On attempting to make an identical metadata/object ID association​

4.3.0? Likely earlier?

The TL;DR for me is that if you run Ubuntu 20.04 or unixODBC >= 2.3.6, you may get some weird behaviour when setting metadata or overwriting files (and setting metadata at the same time), if you:

"Overwrite a data object with the same metadata AVUs attached.
This is being caused by an insert SQL statement being performed and violating the duplicate key constraint (i.e. the same metadata ID is being associated with the same data object ID in the catalog). This error is actually happening even in ubuntu 18.04 on inspection of the ODBC trace logs, but an error is not raised presumably due to a bug in unixODBC. This error is now being raised, so we need to make sure we check before inserting.
The unixODBC version on ubuntu 20.04 is 2.3.6. The version on ubuntu 18.04 is 2.3.4."
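
To make the "check before inserting" idea concrete, here is a small sketch using an in-memory SQLite database. The table name object_metadata_map, its columns, and the link_metadata helper are made up for illustration; this is not the iRODS catalog schema or the actual fix.

```python
import sqlite3


def link_metadata(conn, object_id, meta_id):
    """Associate meta_id with object_id unless the pair already exists.

    Returns True if a row was inserted, False if it was already present.
    """
    exists = conn.execute(
        "SELECT 1 FROM object_metadata_map WHERE object_id = ? AND meta_id = ?",
        (object_id, meta_id),
    ).fetchone()
    if exists:
        return False
    conn.execute(
        "INSERT INTO object_metadata_map (object_id, meta_id) VALUES (?, ?)",
        (object_id, meta_id),
    )
    return True


conn = sqlite3.connect(":memory:")
# Hypothetical mapping table with a duplicate-key constraint on the pair.
conn.execute(
    "CREATE TABLE object_metadata_map ("
    " object_id INTEGER NOT NULL,"
    " meta_id INTEGER NOT NULL,"
    " PRIMARY KEY (object_id, meta_id))"
)
first = link_metadata(conn, 10001, 20002)
second = link_metadata(conn, 10001, 20002)
print(first, second)
```

A blind second INSERT of the same pair would raise sqlite3.IntegrityError here, which is the same class of error the stricter unixODBC now reports.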

​iRODS server process needs to remove leftover shared memory files on startup​

4.2.12;

"Today, the startup scripts (i.e. irodsctl, etc.) are responsible for cleaning up leftover shared memory files. This responsibility should belong to the iRODS server process."

​check/fix permissions required for rename (imv)​

Follow on to #3106; “that’s a small enhancement: one user was not able to rename (imv) a collection on which he had “modify” rights: he needs to have “own” rights on the collection to do this operation. It should be added in the ichmod help menu.”

​Consider making iinit prompt user about use of SSL​

4.3.1 - Related to icommands #56 “Include a question if default SSL parameters in ~/.irods/irods_environment.json should be included. This is required to communicate with a Server with irods_client_server_policy=CS_NEG_REQUIRE.”
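
As a rough idea of what such a prompt could add, here is a sketch that merges a set of SSL-related settings into an environment dictionary. The key names and values are the ones I believe the client environment file uses, but treat them as assumptions and check them against the documentation for your release; the with_ssl_defaults helper and the host name are purely illustrative, and nothing here reads or writes ~/.irods/irods_environment.json.

```python
import json

# Assumed client-side SSL settings; verify names and values against the
# iRODS documentation for your release before relying on them.
SSL_DEFAULTS = {
    "irods_client_server_negotiation": "request_server_negotiation",
    "irods_client_server_policy": "CS_NEG_REQUIRE",
    "irods_encryption_algorithm": "AES-256-CBC",
    "irods_encryption_key_size": 32,
    "irods_encryption_num_hash_rounds": 16,
    "irods_encryption_salt_size": 8,
    "irods_ssl_verify_server": "cert",
}


def with_ssl_defaults(environment):
    """Return a copy of environment with the SSL defaults added.

    Keys the user has already set are left untouched.
    """
    merged = dict(SSL_DEFAULTS)
    merged.update(environment)
    return merged


env = with_ssl_defaults({"irods_host": "irods.example.org", "irods_port": 1247})
print(json.dumps(env, indent=4, sort_keys=True))
```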

​iquest doesn’t fail on some invalid queries, but returns incorrect result​

Found on 4.2.7, targeted for a fix in 4.3.1.

"I (accidentally) passed an invalid query to iquest which was missing a closing single-quote."

​Several iCommands do not disconnect on authentication failure​

Targeted for 4.2.12 and it’s quite the list!

​Update documentation regarding replica statuses​

Targeted for 4.2.12;

"The help text for ils lists and describes the various statuses which a replica can have. The integer values in the list do not correspond to those which appear in the catalog. It should look like this:
    stale
    good
    intermediate
    read-locked
    write-locked
The read-locked entry does not need to be fully fleshed out because it has not been implemented, but at least the values will be correct.
The documentation for troubleshooting objects stuck in a locked state should detail the allowed values for iadmin modrepl as well."
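
If you script against these values, a small lookup can save some head-scratching. The sketch below assumes the list in the issue is numbered from 0 in the order shown (so stale is 0 and write-locked is 4); that numbering is my assumption, so check it against your catalog before trusting it. ReplicaStatus and describe are illustrative names, not part of any iRODS client.

```python
from enum import IntEnum


class ReplicaStatus(IntEnum):
    # Assumed numbering: the order of the list in the issue, starting at 0.
    STALE = 0
    GOOD = 1
    INTERMEDIATE = 2
    READ_LOCKED = 3
    WRITE_LOCKED = 4


def describe(value):
    """Turn a status value (int or numeric string) into its readable name."""
    return ReplicaStatus(int(value)).name.lower().replace("_", "-")


for raw in ("0", "1", "4"):
    print(raw, describe(raw))
```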

​Remote psql server connection over SSL​

An excellent issue report talking about how the author got SSL support working with PostgreSQL and iRODS, as the normal setup didn’t configure it quite right.

Worth looking at if you need this kind of setup. Seems a shame it’s not scheduled for a particular release, given the detail the report goes into. Perhaps when the UGM is over and 4.3.0 is out the door!

​The pep_database_mod_data_obj_meta_* PEPs not called when new file uploaded ​

4.2.11, targeted for 4.2.12 fix.

TLDR; “With the introduction of logical locking, we failed to realize that database PEPs weren’t being triggered due to direct use of nanodbc.

We will restore this ability in 4.2.12, but not through that PEP. We will provide more details once work on 4.2.12 picks up.
Below are a couple ways to get around this limitation:
Use the pep_api_* PEPs
Implement a sweeper that finds all replicas without a checksum and process them"

​Add commonly referenced strings to a header file for authentication​

​Data object stuck in locked/intermediate status when ‘agent stop’ network plugin operation fails​

4.2.11, targeted for 4.2.12 fix.

"When the 'agent stop' network plugin operation fails on agent teardown, the cleanup() step fails to occur (and some memory leaks as well)"

and

"We should be freeing the memory and calling cleanup() no matter what to ensure that no memory leaks and data objects do not get stuck in the locked/intermediate status"
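
The "no matter what" part is the classic try/finally shape. The server itself is C++, so take this Python sketch only as an illustration of the pattern; teardown_agent, stop_network and cleanup are hypothetical names.

```python
def teardown_agent(stop_network, cleanup):
    """Run the network 'stop' step, then always run cleanup.

    Returns True if the stop step succeeded, False otherwise.
    """
    try:
        stop_network()
        return True
    except Exception as exc:
        print(f"agent stop failed: {exc}")
        return False
    finally:
        cleanup()  # runs whether the stop step succeeded or failed


def failing_stop():
    raise RuntimeError("network plugin error")


events = []
ok = teardown_agent(failing_stop, lambda: events.append("cleanup"))
print(ok, events)
```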

This seems to have affected users of the Globus plugin; I’m not sure how much impact it has on other users of a 4.2.11 system.

​Potential leak in auth plugins​

4-2-stable, targeted for 4.2.12 fix.

​Consider exposing timeout options for controlling delay server migration communication​

4.3.1 “Depending on the network, admins may need to adjust the timeout values so that delay server migration operates correctly.”

I certainly support having this tunable with sensible, documented defaults.

​iscan Usage against local object seems not working​

Documentation update for 4.2.12;

"Then it seems the documentation below a bit confusing -or at least for me! How I understand the document is like 'a local file and/or a directory is an object and path on where I upload it to irods'. Could iscan -h perhaps include an usage example later? Thank you."

​ifsck generates only chkObjConsistency​

A regression in 4.2.10 (perhaps earlier?), targeted for 4.2.12 fix.

​Allow multiple arguments for ‘in’ operator of imeta qu​

No release targeted for a fix.

"aka, using the IN (?, ?) SQL syntax"
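
For anyone who hasn’t met the IN (?, ?) form, here is what it looks like with one placeholder per value, using an in-memory SQLite table. The avu table, its columns, and find_by_values are made up for this sketch; this is not how imeta or the catalog implements it.

```python
import sqlite3


def find_by_values(conn, attribute, values):
    """Return object names whose attribute matches any of the given values."""
    values = list(values)
    if not values:
        return []
    # One "?" per value, so the values are bound rather than pasted into the SQL.
    placeholders = ", ".join("?" for _ in values)
    sql = (
        "SELECT object_name FROM avu WHERE attribute = ? "
        f"AND value IN ({placeholders}) ORDER BY object_name"
    )
    return [row[0] for row in conn.execute(sql, [attribute, *values])]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE avu (object_name TEXT, attribute TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO avu VALUES (?, ?, ?)",
    [
        ("a.dat", "colour", "red"),
        ("b.dat", "colour", "green"),
        ("c.dat", "colour", "blue"),
    ],
)
matches = find_by_values(conn, "colour", ["red", "blue"])
print(matches)
```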

​Workaround​

​ils terminates with uncaught exception on SSL_HANDSHAKE_ERROR​

4.2.11, targeted for 4.2.12 fix.

​iadmin suq Help text for docs.irods.org is confusing​

Targeted for 4.2.12 fix.

" Should be;
suq User ResourceName Value (set user quota)
Set a quota for a particular user for a resource.  Use 0 for the value
to remove quota limit.  Value is in bytes.  As with other sub-commands,
'user' is of the form userName[#zone] where the local zone is default.
Also see sgq, lq and cu."

​irodsMonPerf.config and rodsMonPerfLog are being ignored.​

OK, I knew there was something like this, but I thought it was set up just for single servers. Then I went documentation hunting and couldn’t find anything, so I sort of hijacked the issue… ;-)

I wonder if Mr Edgin will do a talk on his investigations at UGM this year? He’s been looking into all kinds of interesting things, all pretty much leading edge in the iRODS space, so I’d really appreciate it! If he does do one, do attend, his talks are always interesting!

​Flesh out common CPack definitions CMake module​

Closed Issues

Closed on - 2022-05-17 20:37:33 iquest wrongly gives valid answer to incorrect or impossible cross-zone query​

Found on 4.2.7, fixed in 4.2.11

"iquest should report that cross-zone queries are not possible, or that the zone does not exist; and exit with an error message and non-zero return code."

Closed on - 2022-05-18 19:42:56 install(DIRECTORY) for scripts picks up pycache and friends, and clobbers permissions​

Closed on - 2022-05-18 19:42:55 More bad path assumptions​

Closed on - 2022-05-18 19:42:55 paths.py does not use install directories set in CMake​

Closed on - 2022-05-16 15:19:52 Remove support for password argument from iinit​

4.3.0

"iinit allows the user to supply a password as an argument which will be used in lieu of the password prompt. This should be removed as it is insecure and does not add anything that cannot be accomplished through other means (i.e. scripting)."

Closed on - 2022-05-03 12:57:14 Add old submodule directory to .gitignore​

Closed on - 2022-04-27 21:59:38 streamline github actions​

Closed on - 2022-04-29 17:47:56 Server must not run delay server migration checks when shutting down​

4.3.0

"Because the delay server migration logic runs asynchronously, it is possible for network requests to be executed while shutdown is in progress.
This results in the shutdown logic having to send SIGKILL to all remaining processes in the worst case.
Still, we need to adjust the migration code so that shutdown is handled gracefully.
Adding a server status check before executing any network request should resolve this."
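
The last sentence of that quote boils down to "look at a shared flag before you touch the network". A minimal sketch of the idea, assuming a simple in-process flag; shutting_down and run_migration_check are hypothetical names, and the real server shares its state across processes rather than threads.

```python
import threading

# Set once shutdown begins; background work checks it before any network call.
shutting_down = threading.Event()


def run_migration_check(send_request):
    """Send the request unless shutdown is in progress.

    Returns True if the request was sent, False if it was skipped.
    """
    if shutting_down.is_set():
        return False
    send_request()
    return True


sent = []
before = run_migration_check(lambda: sent.append("ping"))
shutting_down.set()
after = run_migration_check(lambda: sent.append("ping"))
print(before, after, sent)
```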

Closed on - 2022-04-25 23:10:30 Remove python dependency declarations from runtime package​

Closed on - 2022-05-03 12:56:14 chown_directories_for_postinstall.py stomps on msiExecCmd_bin ownership​

Closed on - 2022-04-25 20:33:34 icommands package does not adjust file/dir permissions properly​

Closed on - 2022-04-25 23:10:29 Add python3-distro to package dependencies for server package​

Closed on - 2022-04-28 20:32:40 skip tests requiring NREP when NREP status not known​

Closed on - 2022-04-27 14:43:32 Remove irm -n​

4.3.0

"Deprecated in 4.2.3: 344426e"

Closed on - 2022-04-25 20:32:34 Add convenience function for setting the “ips” name of a connection (C/C++ clients only)​

Closed on - 2022-04-25 20:32:11 Make server state available across multiple processes​

Closed on - 2022-04-25 20:32:43 CRON manager allows exceptions to escape its processing loop​

Closed on - 2022-04-28 20:25:33 allow building irods server RPM for 4.3.0 testing​

Closed on - 2022-05-18 17:20:14 Update processes around unattended installations​

Closed on - 2022-05-12 18:52:03 CentOS 7 requires pyodbc externals package​

Closed on - 2022-04-27 18:51:59 Packaged python files with python3-only syntax cause packaging to fail on some Fedora-based systems​

Closed on - 2022-04-25 19:30:34 Ensure namelinks are only in irods-dev package​

Closed on - 2022-04-27 18:51:49 CPack fails to byte-compile python3 syntax on CentOS 7​

Closed on - 2022-05-06 20:54:27 Consider renaming functions dealing with shared memory in rodsServer.cpp​

Closed on - 2022-05-20 16:32:26 Ensure icommands userspace tarball packaging still works for 4.3​

Closed on - 2022-04-27 13:55:25 Organize and establish conventions for irods_configuration_keywords.hpp​

Closed on - 2022-05-19 20:48:04 Killing the agent factory (irodsServer) results in leaked processes when restarted​

Closed on - 2022-05-19 20:48:26 Killing the delay server (irodsReServer) kills the main server​

Closed on - 2022-05-16 16:45:37 Bump externals in jenkins CI hook scripts​

Closed on - 2022-04-27 21:59:47 Add support for Rocky Linux 8​

Closed on - 2022-05-16 16:45:29 Add new options for build/test hooks​

Closed on - 2022-04-27 22:02:45 Add support for AlmaLinux 8​

Closed on - 2022-05-10 19:06:18 Deprecate all static policy enforcement points (PEPs)​

Closed on - 2022-05-16 13:16:17 Use more inclusive language​

Closed on - 2022-05-16 15:48:50 Add support for optional server configuration service endpoint​

Closed on - 2022-05-18 00:47:07 maximum_number_of_concurrent_rule_engine_server_processes seems not to work​

Closed on - 2022-05-19 19:50:04 Refactor authentication plugin interface leveraging Authentication Working Group design​

Closed on - 2022-05-19 02:25:09 clean up old/legacy/outdated rulefiles and unused microservices​

Closed on - 2022-05-19 02:19:00 Define ACL meaning​

Closed on - 2022-05-19 02:18:07 inconsistent naming in ACL’s​

Closed on - 2022-04-25 20:33:34 Allow migration of the delay server​

Closed on - 2022-05-02 19:31:13 Define a summary and description for the rpm package built by CPack​

Closed on - 2022-04-27 14:43:20 Remove deprecated -U option from irm​

Closed on - 2022-05-14 21:24:53 remove all deprecated msiSys* microservices​

Closed on - 2022-05-19 13:59:59 Remove user quota functionality​

Closed on - 2022-05-19 02:37:35 setup_irods.py should support high availability (HA) (multiple providers)​

Closed on - 2022-04-27 14:42:58 Remove unused rule submit tags​

Closed on - 2022-05-11 16:57:45 setup_irods.py should prompt/configure for different/standalone delay server​

Closed on - 2022-05-11 16:23:10 unsolicited debug output “Level 0” upon PEP with on clause​

Closed on - 2022-05-07 19:58:53 Change default server SSL policy from CS_NEG_DONT_CARE to CS_NEG_REFUSE​

Closed on - 2022-05-19 02:16:01 enhancement of ACLs documentation (ichmod)​

Closed on - 2022-05-18 13:23:59 irsync problem​

Closed on - 2022-04-28 17:48:13 imeta syntax for IN queries is not listed in documentation​

Closed on - 2022-05-10 18:40:48 password visible in debug log while using PAM and SSL​

Closed on - 2022-04-27 22:00:05 Remove messaging schema submodule​

Closed on - 2022-05-11 16:21:00 Current behaviour when doing recursive iputs involving symlinks is possibly surprising.​

"Work done for #3988, #4009, #4013 for 4.2.4 appears to cover all these cases (see 3e6c08a)."

Closed on - 2022-05-19 02:11:20 change ichmod to handle the full range of the iRODS permission model​

I think this is rolled out in 4.3.0, but I will be looking at the ACL docs on release if it’s not called out in the notes. Potentially very useful!

"We exposed/increased the Permission Model to 10 levels, not the original/described 18."

Well worth looking through this if you care about more nuanced access control than is currently available. It’s also worth it for the awesome comment, which sets the standard by which all others should be judged, including mine!

Python iRODS Client Activity

Open Issues

​allow direct calls to set via object handles​

Here “set” means setting metadata, aka imeta set.

​PAM authentication fails when there is an ‘=’ character in the password​

Closed Issues

Closed on - 2022-05-20 12:47:30 Atomic metadata operations don’t work with the group permission​

"In case a user has own access only provided by the group permission, then this user cannot add metadata via the atomic operations."

and

"closing this as related to [irods/irods#6191](https://github.com/irods/irods/issues/6191) makes sense?"

Closed on - 2022-04-29 11:54:58 Query object doesn’t throw CAT_NO_ROWS_FOUND once required​

"Yes, the PRC is less likely to output incidental (though possibly helpful) things to the console, unlike the icommand clients.... Are you attempting a substitution of a Python client script for an iquest command? You could always test the length of the query by using
    (a) list( results ) or
    (b) list( itertools.islice (results, 0, 1) )
in the Python, and printing CAT_NO_ROWS_FOUND if the resulting list were zero-length.
(Caveat: the query iterator would then resume from that point with, respectively, either (a) all rows or (b) one row having been consumed)"
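
Option (b) and its caveat can be handled together by putting the peeked row back in front of the iterator. This sketch uses a plain generator in place of a real PRC query, so fake_query and peek are my own stand-ins and not python-irodsclient API.

```python
import itertools


def peek(results):
    """Return (is_empty, iterator that still yields every row)."""
    iterator = iter(results)
    head = list(itertools.islice(iterator, 0, 1))  # consumes at most one row
    # Chain the consumed row back on so the caller loses nothing.
    return (not head), itertools.chain(head, iterator)


def fake_query(rows):
    """Stand-in for a lazily evaluated query result."""
    yield from rows


empty, rows = peek(fake_query([]))
if empty:
    print("CAT_NO_ROWS_FOUND")

empty2, rows2 = peek(fake_query(["row1", "row2"]))
all_rows = list(rows2)
print(empty2, all_rows)
```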

externals Activity

Closed Issues

Closed on - 2022-05-18 14:10:20 Compiling for Debian 11​

Closed on - 2022-05-18 18:44:03 .yum.repo files construct bad baseurl for rocky/alma/ubi​

If you think someone else would appreciate this newsletter, they can sign up at https://theresource.metadata.school/

Some Yaks were shaved in the making of this newsletter.
