Monday, August 26, 2013

FileEvent - Example 5 - Complex Pattern Matching

Introduction

So far, the pattern handling in the previous FileEvent examples has been useful but probably unremarkable. The recently released version 1.0.1 extends this functionality only slightly, but with a big impact.

It is often useful to determine the destination from parts of the filename rather than from the filename as a whole. For example, consider the following naming scheme:

<environment>.<instance>.<filename>

For example:

P.00.mytestfile
T.01.anotherfile
T.02.afile
D.00.testfile1
D.01.testfile1

In the above examples the “P” represents “production”, “T” represents “test” and “D” represents “development”. Since many organisations have multiple parallel development streams the second component of the filename indicates the instance of that environment.
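To make the naming scheme concrete, here is a minimal Python sketch that splits such a filename into its three parts. It is purely illustrative and not part of FileEvent; the `ENVIRONMENTS` table and the `split_name` helper are hypothetical names introduced for this example.

```python
# Hypothetical lookup table for the environment codes described above.
ENVIRONMENTS = {"P": "production", "T": "test", "D": "development"}


def split_name(filename):
    """Split '<environment>.<instance>.<filename>' into its three parts."""
    # Split on the first two dots only, so the remainder may itself contain dots.
    env_code, instance, rest = filename.split(".", 2)
    return ENVIRONMENTS[env_code], instance, rest


print(split_name("P.00.mytestfile"))  # ('production', '00', 'mytestfile')
print(split_name("D.01.testfile1"))   # ('development', '01', 'testfile1')
```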

Improved Macro Handling

Assuming that all incoming files land in the same directory, you might want FileEvent to move files into the appropriate directory for each environment based on the filename. With the facilities introduced so far this would be difficult, but it is now possible to put partial pattern matches into variables using the following syntax:

%{varname,pattern}

We can use this facility to break down the source filenames into different parts, for example:

<file_pattern>%{myenv_type,[PTD]}.%{myenv_inst,\d+}.%{fname,.*}.dat</file_pattern> 

We can then use that information in the target counter, to ensure unique counters for all files in all environments:

<target_counter>%{myenv_type}_%{myenv_inst}_%{fname}</target_counter>

And of course we use the same information to determine the suitable target directory:

<xfer_destination>/tmp/mydestdir/%{myenv_type}/%{myenv_inst}/%{fname}</xfer_destination>
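The capture-and-substitute idea can be sketched in a few lines of Python. This is only an illustration of the concept, not how FileEvent itself implements it: the helpers `pattern_to_regex` and `expand` are hypothetical, and the sketch assumes the text after the comma is an ordinary regular expression that contains no closing brace.

```python
import re

# A capturing macro such as %{myenv_type,[PTD]}: a name, a comma, then a pattern.
# Assumption: the pattern part contains no '}' character.
CAPTURE_MACRO = re.compile(r"%\{(\w+),(.*?)\}")
# A plain reference such as %{myenv_type}.
REFERENCE_MACRO = re.compile(r"%\{(\w+)\}")


def pattern_to_regex(file_pattern):
    """Rewrite each %{name,pattern} as a named group (?P<name>pattern)."""
    expanded = CAPTURE_MACRO.sub(
        lambda m: "(?P<%s>%s)" % (m.group(1), m.group(2)), file_pattern
    )
    return re.compile(expanded)


def expand(template, variables):
    """Replace each %{name} reference with the captured value."""
    return REFERENCE_MACRO.sub(lambda m: variables[m.group(1)], template)


regex = pattern_to_regex(r"%{myenv_type,[PTD]}.%{myenv_inst,\d+}.%{fname,.*}.dat")
match = regex.fullmatch("T.02.filename3.dat")
variables = match.groupdict()

print(variables)
# {'myenv_type': 'T', 'myenv_inst': '02', 'fname': 'filename3'}
print(expand("%{myenv_type}_%{myenv_inst}_%{fname}", variables))
# T_02_filename3
print(expand("/tmp/mydestdir/%{myenv_type}/%{myenv_inst}/%{fname}", variables))
# /tmp/mydestdir/T/02/filename3
```

Note that the dots in the pattern are left unescaped, exactly as in the pattern above, so in this sketch they match any single character rather than only a literal dot.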

Tying this all together gives a sample configuration file:

<?xml version="1.0" standalone="yes"?> 
<FileEvent> 
  <settings> 
    <runhosts>bongo</runhosts> 
    <runusers>venture</runusers> 
    <local>*;maxfiles=10</local> 
    <db>%{ENV:FILE_EVENT_DB}</db> 
    <counters_db>%{ENV:FILE_EVENT_COUNTERS_DB}</counters_db> 
  </settings> 
  <event> 
    <send_which>newest</send_which> 
    <min_send_count>1</min_send_count> 
    <max_send_count>%{maxfiles}</max_send_count> 
    <min_age>0</min_age> 
    <description>xfer test transfer</description> 
    <directory>%{ENV:TESTFILES}</directory> 
    <file_pattern>%{myenv_type,[PTD]}.%{myenv_inst,\d+}.%{fname,.*}.dat</file_pattern> 
    <xfer_job_type>mv</xfer_job_type> 
    <target_counter>%{myenv_type}_%{myenv_inst}_%{fname}</target_counter> 
    <create_counter>true</create_counter> 
    <xfer_destination>/tmp/mydestdir/%{myenv_type}/%{myenv_inst}/%{fname}</xfer_destination> 
    <post_archive>false</post_archive> 
  </event> 
</FileEvent> 

Notice we are also making use of two additional global settings:

  • runhosts – a list of hosts where this configuration can be run. Useful if the file is stored on shared storage and only particular hosts should make use of it.
  • runusers – a list of usernames permitted to run the configuration.

Once this is run it works just as you might expect; the verbose output is:

2013/08/25 23:08:14.113 0000000-I Events to load from configuratione file: 1 
2013/08/25 23:08:14.113 0000001-I Early directory pattern change for event #0: 
2013/08/25 23:08:14.113 0000002-I %{ENV:TESTFILES} => /home/venture/projects/SOURCE/fileevent/testing 
2013/08/25 23:08:14.119 0000003-I Event #0 [xfer test transfer] processing. 
2013/08/25 23:08:14.123 0000004-I Counters rename: /tmp/mydestdir/T/02/filename3 => /tmp/mydestdir/T/02/filename3.000003 + /tmp/mydestdir/T/02/filename3.000003.done 
2013/08/25 23:08:14.329 0000005-I Successfully sent '/home/venture/projects/SOURCE/fileevent/testing/T.02.filename3.dat'. 
2013/08/25 23:08:14.332 0000006-I Counters rename: /tmp/mydestdir/T/01/file2 => /tmp/mydestdir/T/01/file2.000003 + /tmp/mydestdir/T/01/file2.000003.done 
2013/08/25 23:08:14.438 0000007-I Successfully sent '/home/venture/projects/SOURCE/fileevent/testing/T.01.file2.dat'. 
2013/08/25 23:08:14.441 0000008-I Counters rename: /tmp/mydestdir/P/00/file1 => /tmp/mydestdir/P/00/file1.000003 + /tmp/mydestdir/P/00/file1.000003.done 
2013/08/25 23:08:14.539 0000009-I Successfully sent '/home/venture/projects/SOURCE/fileevent/testing/P.00.file1.dat'. 
2013/08/25 23:08:14.543 0000010-I Counters rename: /tmp/mydestdir/D/01/myfile => /tmp/mydestdir/D/01/myfile.000003 + /tmp/mydestdir/D/01/myfile.000003.done 
2013/08/25 23:08:14.762 0000011-I Successfully sent '/home/venture/projects/SOURCE/fileevent/testing/D.01.myfile.dat'. 

Notice that these filenames do not contain dates. Date handling is optional: if no date is present, the file's modification time is used instead, though that is not really made use of in this example.
