For a data entry project, a user can enter variables using a short-hand notation:
"Pour i1 into a flask."
"Warm the flask to 25 degrees C."
"Add 1 drop of i2 to the flask."
"Immediately seek cover."
In this case i1 and i2 are reference variables, where the number refers to an ingredient. The text strings are in the INSTRUCTION table, the ingredients the INGREDIENT table.
Each ingredient has a sequence number for sorting purposes.
Users may rearrange the ingredient order, which adversely changes the instructions. For example, the ingredient order might look as follows, initially:
seq | label
1 | water
2 | sodium
The user adds another ingredient:
seq | label
1 | water
2 | sodium
3 | francium
The user reorders the list:
seq | label
1 | water
2 | francium
3 | sodium
At this point, the following line is now incorrect:
"Add 1 drop of i2 to the flask."
The i2 must be renumbered (because ingredient #2 was moved to position #3) to point to the original reference variable:
"Add 1 drop of i3 to the flask."
This is a simplified version of the problem. The full problem can have lines such as:
"Add 1 drop of i2 to the o3 of i1."
Where o3 is an object (flask), and i1 and i2 are water and sodium, respectively.
The ingredient table is structured as follows:
id | seq | label
The instruction table is structured as follows:
step
The algorithm I have in mind:
steps that match the expression '\mi([0-9]+)':The algorithm may be incorrect as written. There could be two reference variables that must change. Consider before:
seq | label
1 | water
2 | sodium
3 | caesium
4 | francium
And after (swapping sodium and caesium):
seq | label
1 | water
2 | caesium
3 | sodium
4 | francium
Every i2 in every step must become i3; similarly i3 must become i2. So
"Add 1 drop of i2 to the flask, but absolutely do not add i3."
Becomes:
"Add 1 drop of i3 to the flask, but absolutely do not add i2."
The code to perform the first two parts of the algorithm resembles:
CREATE OR REPLACE FUNCTION
renumber_steps(
p_ingredient_id integer,
p_old_sequence integer,
p_new_sequence integer )
RETURNS void AS
$BODY$
DECLARE
v_tokens text[];
BEGIN
FOR v_tokens IN
SELECT
t.tokens
FROM (
SELECT
regexp_split_to_array( step, '\W' ) tokens,
regexp_matches( step, '\mi([0-9]+)' ) matches
FROM
instruction
) t
LOOP
RAISE NOTICE '%', v_tokens;
END LOOP;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
What is a more efficient way to solve this problem (i.e., how would you eliminate the looping constructs), possibly leveraging PostgreSQL-specific features, without a major revision to the data model?
Thank you!
PostgreSQL 9.1.2.
You have to take care that you don't change ingredients and seq numbers back and forth. I introduce a temporary prefix for ingredients and negative numbers for seq for that purpose and exchange them for permanent values when all is done.
Could work like this:
CREATE OR REPLACE FUNCTION renumber_steps(_old int[], _new int[])
RETURNS void AS
$BODY$
DECLARE
_prefix CONSTANT text := ' i'; -- prefix, incl. leading space
_new_prefix CONSTANT text := ' ###'; -- temp prefix, incl. leading space
i int;
o text;
n text;
BEGIN
IF array_upper(_old,1) <> array_upper(_new,1) THEN
RAISE EXCEPTION 'Array length mismatch!';
END IF;
FOR i IN 1 .. array_upper(_old,1) LOOP
IF _old[i] <> _new[i] THEN
o := _prefix || _old[i] || ' '; -- leading and trailing blank!
-- new instruction are temporarily prefixed with new_marker
n := _new_prefix || _new[i] || ' ';
UPDATE instruction
SET step = replace(step, o, n) -- replace all instances
WHERE step ~~ ('%' || o || '%');
UPDATE ingredient
SET seq = _new[i] * -1 -- temporarily negative
WHERE seq = _old[i];
END IF;
END LOOP;
-- finally replace temp. prefix
UPDATE instruction
SET step = replace(step, _new_prefix, _prefix)
WHERE step ~~ ('%' || _new_prefix || '%');
-- .. and temp. negative seq numbers
UPDATE ingredient
SET seq = seq * -1
WHERE seq < 0;
END;
$BODY$
LANGUAGE plpgsql VOLATILE STRICT;
Call:
SELECT renumber_steps('{2,3,4}'::int[], '{4,3,2}'::int[]);
The algorithm requires ...
... that ingredients in the steps are delimited by spaces.
... that there are no permanent negative seq numbers.
_old and _new are ARRAYs of the old and new instruction.seq of ingredients that change position. The length of both arrays has to match, or an exception will be raised. It can contain seq that don't change. Nothing will happen to those.
Requires PostgreSQL 9.1 or later.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With