I've bumped into a very complicated problem (from my perspective as a newbie) and I'm not sure how to solve it. I can think of the workflow but not the script.
I have file A that looks like the following: Teacher (tab) Student1 (space) Student2 (space) ...
Fiona Nicole Sherry
James Alan Nicole
Michelle Crystal
Racheal Bobby Dan Nicole
The names sometimes have a number appended when two people share the same name (e.g., John1, John2). Students may also appear under several teachers if they have more than one advisor.
File B is a file that has groups of teachers together. It looks similar but the values are comma-delimited.
Fiona Racheal,Jack
Michelle Racheal
Racheal Fiona,Michelle
Jack Fiona
The trend in file B is that a key has multiple values and each value becomes a key as well to easily find who is grouped with who.
The output I would like is which students are likely to receive a similar education based on their teachers/groups. So I would like the script to do the following:
Open file B and go through each teacher to see if they have students (some may not; the actual list is quite big). So if I take the first teacher, Fiona, it will look in the hash table built from file A to see if there is a Fiona. If there is (in this case with Nicole and Sherry), add each of her students as a new key to a new hash table.
while (<Group>) {
chomp;
$data=$_;
$data=~/^(\S+)\s+(.*)$/;
$TeacherA=$1;
$group=$2;
Then, look at the teachers who are grouped with Fiona (Racheal, Jack). Take one person at a time (Racheal first).
if (defined??) {
while ($list=~/(\w+)(.*)/) {
$TeacherB=$1;
$group=$2;
Print student-student and teacher-teacher group.
Nicole Bobby,Dan,Nicole Fiona Racheal
Sherry Bobby,Dan,Nicole Fiona Racheal
Since the next teacher in Fiona's group, Jack, doesn't have any students, he would not appear in the results. If he had one, for example David, the results would be:
Nicole Bobby,Dan,Nicole Fiona Racheal
Sherry Bobby,Dan,Nicole Fiona Racheal
Nicole David Fiona Jack
Sherry David Fiona Jack
I'm so sorry for asking such a complicated and specific question. I hope other people who happen to be doing something similar may benefit from the answers. Thank you so much for your help and replies. You are my only source of help.
This is a rather strange way to look at the data, but I think I got it to work the way you tried. It would be interesting to see why you want the data to be that way. Maybe provide column headings next time. Knowing why you do something in a certain way often makes it a lot easier to think of ways to achieve it, in my opinion.
So here's what I did. Don't get confused: I put your values from file A and file B into scalars and changed the part that reads them.
my $file_a = qq~Fiona\tNicole Sherry
James\tAlan Nicole
Michelle\tCrystal
Racheal\tBobby Dan Nicole
~;
my $file_b = qq~Fiona\tRacheal,Jack
Michelle\tRacheal
Racheal\tFiona,Michelle
Jack\tFiona
~;
After that, proceed to read the 'files'.
# 1: Store file A in a hash (teacher => space-separated students)
my (%file_a);
foreach my $line (split /\n/, $file_a) {
    my @temp = split /\t/, $line;
    $file_a{$temp[0]} = $temp[1];
}
# 2: Go through file B
foreach my $line (split /\n/, $file_b) {
    my @line_b = split /\t/, $line;
    # Look in stored file A if the teacher is there
    if (exists $file_a{$line_b[0]}) {
        my (%new_hash_table, @teachers);
        # Put all the students of this teacher into a new hash
        $new_hash_table{$_} = '' foreach split / /, $file_a{$line_b[0]};
        # 3: Take one of the group of teachers who are grouped with the
        # current teacher at a time
        foreach my $teacher (split /,/, $line_b[1]) {
            if (exists $file_a{$teacher}) {
                # 4: This teacher from the group has students listed in file A
                push @teachers, $teacher; # Store the teacher's name for printing later
                foreach (keys %new_hash_table) {
                    # 5: Fill the students as csv for the student keys from step 2
                    # (note: this overwrites the list stored for an earlier teacher)
                    $new_hash_table{$_} = join(',', split(/ /, $file_a{$teacher}));
                }
            }
        }
        foreach my $student (keys %new_hash_table) {
            # 6: Print...
            print join("\t",
                # Student-student relation
                $student, $new_hash_table{$student},
                # Teacher-teacher relation
                $line_b[0], @teachers);
            print "\n";
        }
    }
}
For me that provides the following output:
Sherry Bobby,Dan,Nicole Fiona Racheal
Nicole Bobby,Dan,Nicole Fiona Racheal
Crystal Bobby,Dan,Nicole Michelle Racheal
Bobby Crystal Racheal Fiona Michelle
Nicole Crystal Racheal Fiona Michelle
Dan Crystal Racheal Fiona Michelle
This is probably weird since I don't have all the values.
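One thing to watch in step 5 of the loop above: each teacher in the group overwrites the student list stored by the previous one, which is why Racheal's students end up paired only with Crystal. If what you want is one line per pair of teachers, as in your expected output, a variation along these lines might be closer. This is only a sketch: `%students_of`, `%group_of` and `@rows` are names I made up, and the data is hard-coded here instead of being read from your files.

```perl
use strict;
use warnings;

# Sample data from the question, already parsed (hash names are illustrative).
my %students_of = (
    Fiona    => [qw(Nicole Sherry)],
    James    => [qw(Alan Nicole)],
    Michelle => [qw(Crystal)],
    Racheal  => [qw(Bobby Dan Nicole)],
);
my %group_of = (
    Fiona    => [qw(Racheal Jack)],
    Michelle => [qw(Racheal)],
    Racheal  => [qw(Fiona Michelle)],
    Jack     => [qw(Fiona)],
);

my @rows;
for my $teacher (sort keys %group_of) {
    next unless $students_of{$teacher};        # this teacher has no students
    for my $partner (@{ $group_of{$teacher} }) {
        next unless $students_of{$partner};    # grouped teacher has no students
        my $partner_students = join ',', @{ $students_of{$partner} };
        # One row per student of $teacher, for this teacher pair only
        for my $student (@{ $students_of{$teacher} }) {
            push @rows, join "\t", $student, $partner_students, $teacher, $partner;
        }
    }
}

print "$_\n" for @rows;
```

With the sample data this should give nine rows, the first being Nicole paired with Racheal's students (Bobby,Dan,Nicole) under Fiona and Racheal. Jack is skipped because he has no entry in file A.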
Anyway, there are a few things to be said about this.
In your example code you used a regex like $data=~/^(\S+)\s+(.*)$/; to get to the values of a simple two-column list. It is a lot easier to use the split function to do that.
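To illustrate the point about split, here is a small sketch on a single line in the file A layout. The sample line and the variable names are just for illustration; the limit argument of 2 is what keeps everything after the first column together.

```perl
use strict;
use warnings;

# A sample line in the file A layout: teacher, a tab, then the students.
my $line = "Racheal\tBobby Dan Nicole";

# Split on the first run of whitespace only; the limit of 2 leaves the
# remainder of the line untouched in $rest.
my ($teacher, $rest) = split /\s+/, $line, 2;

# A second split turns the remainder into a list of students.
# The "// ''" guards against a teacher line with no students at all.
my @students = split ' ', $rest // '';

print "$teacher: @students\n";
```

Unlike the regex, this does not leave stale values in $1 and $2 when a line fails to match.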
When you read from a file with the <FILEHANDLE> syntax, you can put the scalar you want your lines to go into in the while loop's condition like so:
while (my $data = <GROUP>) {
    chomp $data;
    # ... work with $data here ...
}
Also it is common to write filehandle names in all-caps.
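Putting these remarks together, a reading loop for file B might look roughly like the following. To keep the sketch runnable without any files on disk, it opens an in-memory filehandle on a string; in your script you would open the real file instead. The hash name `%group_of` is my own choice, not something from your code.

```perl
use strict;
use warnings;

# Stand-in for the contents of file B (two sample lines from the question).
my $file_b = "Fiona\tRacheal,Jack\nMichelle\tRacheal\n";

# Opening a reference to a scalar yields an in-memory filehandle.
open(GROUP, '<', \$file_b) or die "Cannot open in-memory file: $!";

my %group_of;
while (my $data = <GROUP>) {
    chomp $data;
    # First column is the teacher, second the comma-separated group.
    my ($teacher, $group) = split /\t/, $data;
    $group_of{$teacher} = [ split /,/, $group ];
}
close GROUP;

print "$_ => @{ $group_of{$_} }\n" for sort keys %group_of;
```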
I'd suggest you take a look at 'Learning Perl'. The basic concepts of hashes and arrays in there should be enough to tackle tasks like this one. Hope this helps.
I can't imagine why you would want this redundant data when you could just look at file A to get a good idea of who was getting a similar education ... but here is a way of doing it in Perl all the same.
my $data = {};
# pull in students
open(IN, "<", "students.txt") or die "Cannot open students.txt: $!";
while (my $line = <IN>) {
    chomp($line);
    my ($teacher, @students) = split(/\s+/, $line);
    $data->{$teacher}->{students} = \@students;
}
close IN;
# pull in teachers
open(IN, "<", "teachers.txt") or die "Cannot open teachers.txt: $!";
while (my $line = <IN>) {
    chomp($line);
    my ($teacher, $supporters) = split(/\s+/, $line);
    my @supporters = split(/,/, $supporters);
    $data->{$teacher}->{supporters} = \@supporters;
}
close IN;
# make the output
foreach my $teacher (keys %{$data}) {
    foreach my $teacher_student (@{$data->{$teacher}->{students}}) {
        foreach my $supporter (@{$data->{$teacher}->{supporters}}) {
            # "|| []" covers supporters with no students (e.g. Jack)
            my $num_supporter_students = @{ $data->{$supporter}->{students} || [] };
            if ($num_supporter_students) {
                print "$teacher_student\t" .
                    join(",", @{$data->{$supporter}->{students}}) .
                    "\t$teacher\t$supporter\n";
            }
        }
    }
}
When run on the data listed in the question it returns:
Crystal Bobby,Dan,Nicole Michelle Racheal
Nicole Bobby,Dan,Nicole Fiona Racheal
Sherry Bobby,Dan,Nicole Fiona Racheal
Bobby Nicole,Sherry Racheal Fiona
Bobby Crystal Racheal Michelle
Dan Nicole,Sherry Racheal Fiona
Dan Crystal Racheal Michelle
Nicole Nicole,Sherry Racheal Fiona
Nicole Crystal Racheal Michelle
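As for the remark at the top of this answer that file A alone already tells you a lot, one way to see that is to invert file A so each student maps to their teachers. The following is only a sketch with the sample data hard-coded; `%students_of` and `%teachers_of` are names chosen for this example.

```perl
use strict;
use warnings;

# File A from the question, already parsed (hash name is illustrative).
my %students_of = (
    Fiona    => [qw(Nicole Sherry)],
    James    => [qw(Alan Nicole)],
    Michelle => [qw(Crystal)],
    Racheal  => [qw(Bobby Dan Nicole)],
);

# Invert the mapping: student => list of teachers.
my %teachers_of;
for my $teacher (sort keys %students_of) {
    push @{ $teachers_of{$_} }, $teacher for @{ $students_of{$teacher} };
}

for my $student (sort keys %teachers_of) {
    print "$student\t", join(',', @{ $teachers_of{$student} }), "\n";
}
```

Students who share a teacher in this inverted table are the ones with overlapping education, without involving file B at all.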