I am trying to find any intersections of elements within a hash of arrays in Perl.
For example:
    my %test = (
        Lot1 => [ "A", "B", "C" ],
        Lot2 => [ "A", "B", "C" ],
        Lot3 => ["C"],
        Lot4 => [ "E", "F" ],
    );
The result I am after is:
    Lot1 and Lot2 have AB
    Lot1,Lot2 and Lot3 have C
I think this could be done with a recursive function that works its way through the arrays: if it finds an intersection between two arrays, it calls itself recursively with that intersection and the next array. The stopping condition would be running out of arrays.
Once the function has returned, I would have to iterate through the hash to find the arrays that contain these values.
Does this sound like a good approach? I have been struggling with the code, but was going to use List::Compare to determine the intersection.
Thank you.
Array::Utils has an intersect function that gives you the intersection of two arrays. But that's only the starting point of what you're trying to do.
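To make the pairwise step concrete, here is a minimal sketch of a two-array intersection in core Perl. It is not the Array::Utils implementation; the helper name intersect_pair is made up for illustration, and it assumes the elements are plain strings.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Array::Utils): return the elements
# common to two array refs, in the order they appear in the first one.
sub intersect_pair {
    my ( $first, $second ) = @_;
    my %in_second = map { $_ => 1 } @$second;    # lookup table for the second array
    my %seen;                                    # drop duplicates from the first array
    return grep { $in_second{$_} && !$seen{$_}++ } @$first;
}

my @common = intersect_pair( [ "A", "B", "C" ], ["C"] );
print "Common: @common\n";
```

On its own this only answers "what do these two lots share?". It does not tell you which groups of lots share a given element, which is the part the question is really after.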
So I think you first need to invert your lookup:
    my %member_of;
    # sort so that each element's list of lots comes out in a stable order
    foreach my $key ( sort keys %test ) {
        foreach my $element ( @{ $test{$key} } ) {
            push( @{ $member_of{$element} }, $key );
        }
    }
    print Dumper \%member_of;
Giving:
    $VAR1 = {
              'A' => [
                       'Lot1',
                       'Lot2'
                     ],
              'F' => [
                       'Lot4'
                     ],
              'B' => [
                       'Lot1',
                       'Lot2'
                     ],
              'E' => [
                       'Lot4'
                     ],
              'C' => [
                       'Lot1',
                       'Lot2',
                       'Lot3'
                     ]
            };
Then collapse that into a key set:
    my %new_set;
    # sort so that the elements within each set come out in a stable order
    foreach my $element ( sort keys %member_of ) {
        my $set = join( ",", @{ $member_of{$element} } );
        push( @{ $new_set{$set} }, $element );
    }
    print Dumper \%new_set;
Giving:
    $VAR1 = {
              'Lot1,Lot2,Lot3' => [
                                    'C'
                                  ],
              'Lot1,Lot2' => [
                               'A',
                               'B'
                             ],
              'Lot4' => [
                          'E',
                          'F'
                        ]
            };
So overall:
    #!/usr/bin/env perl
    use strict;
    use warnings;
    use Data::Dumper;

    my %test = (
        Lot1 => [ "A", "B", "C" ],
        Lot2 => [ "A", "B", "C" ],
        Lot3 => ["C"],
        Lot4 => [ "E", "F" ],
    );

    # Invert the lookup: element => list of lots that contain it.
    my %member_of;
    foreach my $key ( sort keys %test ) {
        foreach my $element ( @{ $test{$key} } ) {
            push( @{ $member_of{$element} }, $key );
        }
    }

    # Group elements by the exact set of lots they appear in.
    my %new_set;
    foreach my $element ( sort keys %member_of ) {
        my $set = join( ",", @{ $member_of{$element} } );
        push( @{ $new_set{$set} }, $element );
    }

    # Report each set of lots together with the elements it shares.
    foreach my $set ( sort keys %new_set ) {
        print "$set contains: ", join( ",", @{ $new_set{$set} } ), "\n";
    }
I don't think there's a much more efficient way to tackle it: rather than comparing each array to every other array, this visits each element once to invert the lookup and then forms a compound key from the lots it belongs to.
This gives you:
    Lot1,Lot2 contains: A,B
    Lot1,Lot2,Lot3 contains: C
    Lot4 contains: E,F
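If you want the exact wording from the question, and only the elements shared by two or more lots, a final formatting pass along these lines might do it. This is a sketch rather than part of the script above: the helper name shared_lines is made up, the hash is hard-coded to match the output shown, and it assumes lot names never contain a comma, since the compound key is split on ",".

```perl
use strict;
use warnings;

# Hypothetical helper: takes a "set of lots" => elements hash like
# %new_set and returns one line per set shared by at least two lots.
sub shared_lines {
    my ($sets) = @_;
    my @lines;
    foreach my $set ( sort keys %$sets ) {
        my @lots = split /,/, $set;    # assumes no commas in lot names
        next if @lots < 2;             # skip elements found in only one lot
        my $last  = pop @lots;
        my $names = join( ",", @lots ) . " and $last";
        push @lines, "$names have " . join( "", @{ $sets->{$set} } );
    }
    return @lines;
}

my %new_set = (
    'Lot1,Lot2'      => [ 'A', 'B' ],
    'Lot1,Lot2,Lot3' => ['C'],
    'Lot4'           => [ 'E', 'F' ],
);
print "$_\n" for shared_lines( \%new_set );
```

With the data above this prints "Lot1 and Lot2 have AB" and "Lot1,Lot2 and Lot3 have C". Lot4 is dropped because E and F are not shared with any other lot.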