I'd like to obfuscate data in specific columns in PostgreSQL 9.1.
For instance, I want to give every person a 'random' first and last name.
I can generate a pool of names to use:
select name_first into first_names from people order by random() limit 500;
select name_last into last_names from people order by random() limit 500;
Both of those queries run in about 400ms (which is fine for me, since they only need to run once).
A regular UPDATE statement doesn't work: each subquery is uncorrelated, so it is evaluated only once and every person gets the same name:
update people
SET name_last=(SELECT * from last_names order by random() limit 1),
name_first=(SELECT * from first_names order by random() limit 1)
where business_id=1;
How can I give each person a randomized name in postgres? I really don't want to do this in Ruby on Rails - I assume a pure SQL approach will be faster. However, speed isn't too much of a concern as I literally have all night for this business case.
-- Invent some data
CREATE TABLE persons
( id SERIAL NOT NULL PRIMARY KEY
, last_name varchar
);
INSERT INTO persons(last_name)
SELECT 'Name_' || gs::text
FROM generate_series(1,10) gs
;
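-- The update: shuffle the existing last names among the rows.
-- Note: matching new_id to dst.id assumes the ids are gapless (1..N), as in this test data.
WITH swp AS (
SELECT last_name AS new_last_name
-- row_number() (unlike rank()) can never produce ties, so every new_id is unique
, row_number() OVER (ORDER BY random() ) AS new_id
FROM persons
)
UPDATE persons dst
SET last_name = swp.new_last_name
FROM swp
WHERE swp.new_id = dst.id
-- optional condition: skip rows that would receive their own value again
AND swp.new_last_name <> dst.last_name
;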
SELECT * FROM persons
;
RESULT (one possible outcome; the order differs on every run):
 id | last_name
----+-----------
  1 | Name_6
  2 | Name_4
  3 | Name_8
  4 | Name_2
  5 | Name_1
  6 | Name_10
  7 | Name_5
  8 | Name_7
  9 | Name_3
 10 | Name_9
(10 rows)
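The example above shuffles names within one table and relies on gapless ids. Applied to the tables from the question, the same idea could look roughly like the sketch below. It is only a sketch under assumptions: it assumes people has a unique id column (the question does not show one) and that last_names is not empty. Both sides are numbered in random order with row_number(), and the modulo wraps around so that a 500-row pool can cover any number of people.

```sql
-- Sketch only: assumes people.id is a unique key (not shown in the question)
-- and that last_names contains at least one row.
WITH targets AS (
    -- number the rows to be updated in random order
    SELECT id
         , row_number() OVER (ORDER BY random()) AS rn
    FROM people
    WHERE business_id = 1
)
, pool AS (
    -- number the name pool in random order
    SELECT name_last
         , row_number() OVER (ORDER BY random()) AS rn
    FROM last_names
)
, pool_size AS (
    SELECT count(*) AS n FROM last_names
)
UPDATE people dst
SET name_last = pool.name_last
FROM targets t, pool, pool_size ps
WHERE dst.id = t.id
-- wrap around when there are more people than names in the pool
AND pool.rn = (t.rn - 1) % ps.n + 1
;
```

The same statement with first_names and name_first would handle the first names. Because the pool is reused once it is exhausted, names repeat across people, which should be acceptable for obfuscation.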