Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add console commands for Export, Import. Including chunking. Improve Ticket import. #3682

Closed
bschmalhofer opened this issue Aug 6, 2024 · 22 comments
Assignees
Labels
enhancement New feature or request ImportExport Related to Import and Export of data
Milestone

Comments

@bschmalhofer
Copy link
Contributor

bschmalhofer commented Aug 6, 2024

ImportExport is integrated in OTOBO 11.0 and OTOBO 11.1. But there are no console commands that perform Import and Export. So let's provide them. The new commands will be almost exact copies of the commands in ITSMConfigurationManagement, https://github.com/RotherOSS/ITSMConfigurationManagement/tree/master/Kernel/System/Console/Command/Admin/ITSM/ImportExport .

Support for chunking will be added to the commands and to the Translations object backed. Chunking the translations is basically only meant as an example, as it is unlikely that translation will exhaust the machines memory.

Do not add the commands in rel-11_0, as patch releases should have no new features.

@bschmalhofer bschmalhofer added the enhancement New feature or request label Aug 6, 2024
@bschmalhofer bschmalhofer added this to the OTOBO 11.1 milestone Aug 6, 2024
@bschmalhofer bschmalhofer self-assigned this Aug 6, 2024
bschmalhofer added a commit that referenced this issue Aug 6, 2024
to be used for ImportExport
bschmalhofer added a commit that referenced this issue Aug 6, 2024
to be used for ImportExport
@bschmalhofer
Copy link
Contributor Author

bschmalhofer commented Aug 6, 2024

Tentatively added the commands.

TODO:

  • expand test scripts to actually export and import translations
  • support chunked export
  • support chunked import
  • make final decision on the naming conventions, @svenoe
  • manual test

Additional fixes to Import Export:

  • Do not count the writing of the header line as a successful export
  • add a line break after the last line of the export

@bschmalhofer bschmalhofer modified the milestones: OTOBO 11.1, OTOBO 11.1.1 Aug 6, 2024
bschmalhofer added a commit that referenced this issue Aug 8, 2024
Try to consistently spell SQL keywords in upper case
bschmalhofer added a commit that referenced this issue Aug 8, 2024
Try to consistently spell SQL keywords in upper case
@bschmalhofer bschmalhofer added the ImportExport Related to Import and Export of data label Aug 8, 2024
@bschmalhofer bschmalhofer changed the title Provide the console commands Admin::ImportExport::Export and Admin::ImportExport::Import Add commands Admin::ImportExport::Export and Admin::ImportExport::Import, including chunking Aug 9, 2024
bschmalhofer added a commit that referenced this issue Aug 9, 2024
This allows concatenation of CSV files
bschmalhofer added a commit that referenced this issue Aug 9, 2024
mostly as an example that this can be done.
The pragma make the character class `\d` behave like `[0-9]`.
@bschmalhofer
Copy link
Contributor Author

@bschmalhofer
Copy link
Contributor Author

bschmalhofer commented Aug 9, 2024

Another thing to discuss:

The new files Kernel/System/Console/Command/Admin/ImportExport/Import.pm and Kernel/System/Console/Command/Admin/ImportExport/Export.pm are planned to be added to OTOBO 11.1. For OTOBO 11.0 the plan is to backport these files to the ImportExportTicket package.

It can be considered to integrated the two files from ImportExportTicket into OTOBO 11.1.

<Filelist>
        <File Permission="660" Location="Kernel/Config/Files/XML/ImportExportTicket.xml" />
        <File Permission="660" Location="Kernel/System/ImportExport/ObjectBackend/Ticket.pm" />
</Filelist>

This would allow to mark ImportExportTicket simply as integrated, without having to consider the two different cases.

bschmalhofer added a commit that referenced this issue Aug 14, 2024
This allows concatenation of CSV files
bschmalhofer added a commit that referenced this issue Aug 14, 2024
mostly as an example that this can be done.
The pragma make the character class `\d` behave like `[0-9]`.
bschmalhofer added a commit that referenced this issue Aug 29, 2024
on a per ticket basis, not at the end of processing
bschmalhofer added a commit that referenced this issue Aug 30, 2024
as the return value of TemplateGet() might be undefined
bschmalhofer added a commit that referenced this issue Aug 30, 2024
because the config changes are specific for the ImportExport of tickets.
bschmalhofer added a commit that referenced this issue Aug 30, 2024
as the return value of TemplateGet() might be undefined
bschmalhofer added a commit that referenced this issue Aug 30, 2024
because the config changes are specific for the ImportExport of tickets.
@bschmalhofer
Copy link
Contributor Author

The adaptions were up to now done for OTOBO 11.1.x. But that is currently not very useful as this version is not released yet. So let's backport it to OTOBO r11.0.x. As no new features should be added in a patch level release, lets include it into the master branch of https://github.com/RotherOSS/ImportExportTicket.

@bschmalhofer
Copy link
Contributor Author

Some more analysis about the core dump. The suspicion is that the stack size or 8 MB is running out. This could be due to XS-code. Two candidates are Text::CSV_XS and the database driver.

It does not look like the reading of the CSV file is the problem. The 209661 lines, corresponding to 50000 entities, of the test iinput can be read in without problems. It is just not clear why the whole file must be read in before parsing begins.

It also looks like the chunking is not really the problem, when importing the combined file the core dump happens after importing 8735 tickets. That is the same ballpark as with chunked import.

Another wild guess. Kernel::System::DB::Connect() creates an anonymous sub in every call. This sub is usually not needed and it is presumable cleaned up, because it is stored in a lexical variable. But maybe something fishy is going on there.

@bschmalhofer
Copy link
Contributor Author

bschmalhofer commented Aug 30, 2024

The wild idea migth have had an effect. With a small code change in Kernel::System::DB::Connect() the sample file with 10000 tickest could be imported. The stack size grew slowly.

$ pidstat -s -l -C perl
19:44:32     1000    475242    1256    1252  perl bin/otobo.Console.pl Admin::ImportExport::Import --template-number 1 Export_tickets_20270828a.csv

But after the last imported ticket there was still a core dump. I suspect that Kernel::System::Ticket::Article might still have hoarded transaction events.

TODO:

  • watch stack size without the patched Connect() method.

@bschmalhofer
Copy link
Contributor Author

bschmalhofer commented Aug 31, 2024

Another experiment. First I rolled back the changes in Kernel::System::DB::Connect() ran the case with 10000 tickets again and watched the stack size watch -n 0.2 pidstat -H -s -l -p 15340 .

All 10.000 tickets were imported. During the imports the stack size grew in the same way as without the adapted DB module.

Alle 2,0s: pidstat -H -s -l -p 15340 bernhard-Aspire-A515-57: Sat Aug 31 10:30:03 2024

Linux 6.8.0-40-generic (bernhard-Aspire-A515-57) 31.08.2024 x86_64 (12 CPU)

1725093003 UID PID StkSize StkRef Command
1725093003 1000 15340 1172 1172 perl bin/otobo.Console.pl Admin::ImportExport::Import --template-number 1 Export_tickets_20280828a.csv

As the end I briefly say a stack size of over 6000 kB and core was dumped. So this is the first real indication that the stack size limit is reached, presumable during global destruction. The ulimit for stacksize, unlimit -s, is 8192 kB. So core is dumped, process is terminated, when that limit is reached.

bschmalhofer added a commit that referenced this issue Aug 31, 2024
for some articles the transactions will be handled before creating
an article for the next ticket or during global desctruction
bschmalhofer added a commit that referenced this issue Aug 31, 2024
for some articles the transactions will be handled before creating
an article for the next ticket or during global desctruction
@bschmalhofer
Copy link
Contributor Author

bschmalhofer commented Aug 31, 2024

Activated the early triggering of the article transactions. Article transaction are handled just before the next article is created. Some observations:

  1. Ticket creation become a factor of about 5 slower
  2. Used stack size now grows faster during ticket and article creation
  3. Core is dumped after creation of a bit over 7000 tickets, after 8192 is reached

TODO:

  • find out why the stack size increases, whether it is only the transaction events

@bschmalhofer
Copy link
Contributor Author

bschmalhofer commented Sep 1, 2024

More experiments.

Removing all event handlers still let the stack size grow. So the problem is likely not in one of the event handlers.

Disabling Kernel::System::EventHandler::EventHandlerTransaction() did the trick.

diff --git a/Kernel/System/EventHandler.pm b/Kernel/System/EventHandler.pm
index c6b66c80f9..fdfc98b0bf 100644
--- a/Kernel/System/EventHandler.pm
+++ b/Kernel/System/EventHandler.pm
@@ -276,6 +276,8 @@ Kernel::System::EventHandler, like this:
 sub EventHandlerTransaction {
     my ( $Self, %Param ) = @_;
 
+    return 1;
+
     # remember, we are in destroy mode, do not execute new events
     $Self->{EventHandlerTransaction} = 1;

The stack size is no longer growing. So disabling EventHandlerTransaction() is recommended for the upcoming test migration.

The reason why the stack size increases is not obvious. I likely has to do with the localized object manager which is destroyed and EventHandlerTransaction() is called again in the destructor.

TODO:

  • create a small test programm the replicates the growth of the stack

@bschmalhofer
Copy link
Contributor Author

bschmalhofer commented Sep 2, 2024

Turns out that this is apparently caused by binmode which is called in the constructor of Kernel::System::Encode. Does likely not affect the web app, as binmode is only called when an interactive tty is attached.
test_stack_size.pl.txt

The fix is to avoid recreating Kernel::System::Encode.

TODO:

  • test the fix

bschmalhofer added a commit that referenced this issue Sep 2, 2024
because calling binmode often exhausts the stash and causes core dumps.
Relevant for console commands that often call EventHandlerTransaction().
bschmalhofer added a commit that referenced this issue Sep 2, 2024
because calling binmode often exhausts the stash and causes core dumps.
Relevant for console commands that often call EventHandlerTransaction().
@bschmalhofer
Copy link
Contributor Author

The core dump issue looks like it is solved. Got about 300 kB Stacksize after importing 10000 tickets via the console command.

@bschmalhofer
Copy link
Contributor Author

bschmalhofer commented Sep 3, 2024

TODO:

  • sync changes done in the importexportticket package

@bschmalhofer
Copy link
Contributor Author

Harald Jörg kindly analyzed the behavior of binmode, http://www.perl-community.de/bat/poard/thread/21619#ms_196692 . I/O work in Perl in layers, so every call to binmode adds a layer. This is more or less stated in https://perldoc.perl.org/PerlIO . So this is not a bug in Perl, but a consequence of the design.

bschmalhofer added a commit that referenced this issue Sep 4, 2024
quite sensible, that last line of the CSV export now has a newline
at the end of the line
bschmalhofer added a commit that referenced this issue Sep 4, 2024
quite sensible, that last line of the CSV export now has a newline
at the end of the line
@bschmalhofer
Copy link
Contributor Author

ImportExportTicket has been integrated into OTOBO 11.1. Closing this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request ImportExport Related to Import and Export of data
Projects
None yet
Development

No branches or pull requests

1 participant