
How to debug Terraform external providers with concurrency issues?

Tags:

terraform

I use Terraform 0.14.7 to deploy my infrastructure. Since I use AWS Lambda as a deploy target for my Whook API, I have a lot of scripts (7 different ones, but one of them is run for all ~200 lambdas) to retrieve various information about the API and each lambda (see this example: https://github.com/nfroidure/whook/pull/108/files#diff-47625134d02e23a98ccff7918d11baa19c2ac409f4c90d76520b031b613b555cR74-R78).

To do so, I use the external provider, which expects some JSON in return. My various commands do return JSON, but my terraform plan -out=terraform.plan fails with a message saying: "Error: command "env" produced invalid JSON: unexpected end of JSON input".
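
For context, the external data source runs the given program and parses its stdout as a single flat JSON object whose values must all be strings. As a minimal sketch (the file name and keys below are made up, this is not the actual whook command), one of those Node.js scripts essentially boils down to:

#!/usr/bin/env node
// terraform-values.js (hypothetical name): an external-provider program must
// print exactly one flat JSON object of string values to stdout, nothing else.
const result = {
  // Non-string values have to be serialized to strings themselves.
  lambdaNames: JSON.stringify(['getPing', 'putEcho']),
  apiId: 'abc123',
};

process.stdout.write(JSON.stringify(result) + '\n');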

What I tried so far:

  • run TF_LOG=TRACE terraform plan -out=terraform.plan 2>&1: the plan executes correctly. My guess about this unexpected success is that there is somehow a concurrency issue on some resource (file descriptors, memory?), since the back-pressure created by writing heavy trace logs may reduce concurrency. That said, I'm not sure; maybe the trace/debug mode reduces concurrency by itself?
  • run TF_LOG=TRACE terraform plan -out=terraform.plan 2>&1 | grep terraform-provider-external > out.txt: it reproduces the error, but nothing abnormal shows up in the logs, not even the above error. Maybe you have a grep in mind that would catch more information while reducing the log volume, to avoid the back-pressure mentioned above.
  • create aliases for env to at least know which command was failing, but Terraform seems to dislike aliases.
  • log the JSON output of the command via a debug log service in the injector, but everything looks right.

I am currently in the process of creating external scripts that use the tee command to capture the output at the bash level, but I suspect it will consume more file descriptors, so I'm not sure it is the right way to go.

Apart from that, I really have no clue how to get rid of this problem, so I'm trying here before creating an issue to request more verbose output for the external provider in Terraform.

Thanks for your help :)

Update: I managed to find the bad JSON by doing this:

data "external" "lambdas" {
  program     = [
    "bash", "-ec",
    <<-EOF
    echo \"env NODE_ENV=${terraform.workspace} npx whook terraformValues --type='lambdas'\" >> lambdas-stderr.log
    env NODE_ENV=${terraform.workspace} npx whook terraformValues --type='lambdas' 2>> lambdas-stderr.log | tee lambdas-stdout.log
    EOF
  ]
  working_dir = ".."
}

It basically writes stderr to one file and stdout to another, while still proxying stdout to the command output.

With it, I could iterate over all the saved files and find out that the bad JSON was truncated at 65536 bytes (2^16), so I think there is a limitation somewhere. I will keep you up to date if I manage to find out what is limiting the JSON size.

To reproduce the problem every time, I had to set the parallelism option to 100. I think it could be linked to the pipe buffer size (see https://unix.stackexchange.com/questions/11946/how-big-is-the-pipe-buffer) and that, under some conditions, the output read blocks and no back-pressure mechanism takes this into account.

Asked by nfroidure


1 Answer

I finally found the problem. The JSON was truncated because the subprocess was exiting without flushing its output first. See the details for Node.js subprocesses here: https://nodejs.org/api/process.html#process_process_exit_code

Back-pressure was in effect, so only the stdout buffer size (64 KB) was transmitted and the rest of the JSON was truncated. Nothing to do with Terraform, in the end.
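
To illustrate (this is a made-up sketch, not the actual whook code): when stdout is a pipe, writes larger than the pipe buffer are completed asynchronously, and calling process.exit() right after the write tears the process down before the remaining data is drained, so the consumer only sees roughly the first 64 KiB. Letting the process exit naturally, setting process.exitCode, or waiting for the write callback avoids the truncation.

// Sketch of the failure mode and one way around it.
const bigJson = JSON.stringify({ data: 'x'.repeat(500 * 1024) });

// Problematic: the write to a piped stdout is asynchronous, and
// process.exit() kills the process before the buffer is drained.
// process.stdout.write(bigJson);
// process.exit(0);

// Safer: wait for the write to be flushed, then set the exit code and
// let the process terminate on its own.
process.stdout.write(bigJson + '\n', () => {
  process.exitCode = 0;
});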

Answered by nfroidure


