Can I use STL, iostream, new, delete in C/C++ for CUDA?
If you have a Fermi-class GPU (compute capability >= 2.0) and are using CUDA 4.0 or later, then both new and delete are available for use in device code. STL containers and algorithms and iostream are not supported.
If you want to use "STL-like" operations with CUDA, you might be interested in the Thrust template library. It allows host code to transparently interact with the GPU using container types, and it implements a number of very useful data-parallel primitives, like sorting, reduction, and scan. Note that this is still a host-side apparatus: Thrust and its containers cannot be used inside your own kernel code.
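As a rough illustration of that host-side usage, here is a minimal sketch that sorts and sums a handful of integers on the GPU. The helper name `sort_and_sum` and the sample values are made up for the example; the Thrust calls themselves (`thrust::device_vector`, `thrust::sort`, `thrust::reduce`) are part of the library.

```cuda
#include <cstdio>
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/reduce.h>
#include <thrust/sort.h>

// Hypothetical helper: sorts `values` on the GPU, writes the sorted data
// back into the host vector, and returns the sum (also computed on the GPU).
int sort_and_sum(thrust::host_vector<int> &values) {
    // Constructing a device_vector from a host_vector copies the data to the GPU.
    thrust::device_vector<int> d = values;

    // Both calls are issued from host code; Thrust launches the kernels.
    thrust::sort(d.begin(), d.end());
    int sum = thrust::reduce(d.begin(), d.end(), 0);

    // Assigning back copies the sorted data from device to host.
    values = d;
    return sum;
}

int main() {
    thrust::host_vector<int> h(4);
    h[0] = 7; h[1] = 3; h[2] = 9; h[3] = 1;

    int sum = sort_and_sum(h);
    int smallest = h[0];
    int largest = h[3];
    std::printf("smallest=%d largest=%d sum=%d\n", smallest, largest, sum);
    return 0;
}
```

Note that no `__global__` function appears anywhere: everything is driven from ordinary host code, which is the point being made above.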
Let's break this down some.
No, you can't use standard library code on the GPU (i.e. in your device-side code). The most direct obstacle is that the standard library is not written with the CUDA compiler in mind: nothing in it indicates that its code should be compiled for both host-side and device-side execution. But even if this technical issue were dispensed with somehow, there are various reasons why quite a bit of the standard library would not work as-is, or at all, on the GPU.
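To make the "compiled for both sides" point concrete, here is a small sketch of how your own code opts in: a function has to carry CUDA's `__host__ __device__` qualifiers before a kernel may call it, and standard library headers generally carry no such marking. The names `clamp_to_byte`, `clamp_all` and `device_matches_host` are invented for this example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Compiled twice: once for the CPU and once for the GPU.
__host__ __device__ inline int clamp_to_byte(int v) {
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}

// Device-side caller: only functions compiled for the device may be used here.
__global__ void clamp_all(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = clamp_to_byte(data[i]);
}

// Hypothetical check: runs the kernel and compares against the host-side call.
bool device_matches_host() {
    const int n = 4;
    int input[n] = {-5, 17, 255, 300};
    int result[n];

    int *d_data = nullptr;
    if (cudaMalloc(&d_data, n * sizeof(int)) != cudaSuccess) return false;
    cudaMemcpy(d_data, input, n * sizeof(int), cudaMemcpyHostToDevice);

    clamp_all<<<1, n>>>(d_data, n);

    // This blocking copy also waits for the kernel to finish.
    cudaError_t err =
        cudaMemcpy(result, d_data, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_data);
    if (err != cudaSuccess) return false;

    for (int i = 0; i < n; ++i)
        if (result[i] != clamp_to_byte(input[i])) return false;  // host-side call
    return true;
}

int main() {
    std::printf("device and host agree: %s\n",
                device_matches_host() ? "yes" : "no");
    return 0;
}
```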
As talonmies suggests, the Thrust library provides some STL-like functionality, in a useful and nicely packaged way. But it's still mostly a "no" as an answer to your question, since Thrust is, as noted above, driven from host-side code rather than used inside your own kernels.
No, you can't use iostream in CUDA device-side code. We do have C-style printf, however: printf("my_int_value is %05d\n", my_int_value);. This is a very different beast from the standard library printf(), though, since its output has to travel across the PCIe bus and be delivered by the driver to the host-side process's output stream.
See the CUDA Programming Guide's section on formatted output for details.
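For context, a minimal sketch of device-side printf in a complete program might look like the following. The kernel name `report`, the helper `run_report`, and the launch configuration are arbitrary choices for the example. Because the output is buffered on the device, the host synchronises after the launch so the text actually reaches stdout; the order in which the threads' lines appear should not be relied upon.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread prints one line using the device-side printf.
__global__ void report(int my_int_value) {
    printf("thread %d: my_int_value is %05d\n", threadIdx.x, my_int_value);
}

// Hypothetical helper: launches the kernel and flushes its output.
bool run_report(int value) {
    report<<<1, 4>>>(value);

    // Waiting for the kernel is one of the points at which the device-side
    // printf buffer is flushed to the host process's output.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return false;
    }
    return true;
}

int main() {
    return run_report(42) ? 0 : 1;
}
```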
new and delete
The new and delete operators do work, much like on-device malloc() and free(), which behave differently from their host-side counterparts and are somewhat limited; see RobertCrovella's answer on this matter and the links in it.
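A minimal sketch of in-kernel new and delete is shown below, assuming a GPU of compute capability 2.0 or higher. The names `last_square` and `run_last_square` and the 16 MB heap size are arbitrary choices for the example. The allocation comes from a fixed-size device heap, whose size can be adjusted from the host with `cudaDeviceSetLimit(cudaLimitMallocHeapSize, ...)` before the first kernel that allocates is launched.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread allocates its own scratch array on the device heap,
// fills it with squares, and reports the last element.
__global__ void last_square(int n, int *out) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;

    int *scratch = new int[n];
    if (scratch == nullptr) {  // defensive check: the device heap may be exhausted
        out[tid] = -1;
        return;
    }
    for (int i = 0; i < n; ++i) scratch[i] = i * i;
    out[tid] = scratch[n - 1];

    delete[] scratch;  // memory from device-side new is released on the device
}

// Hypothetical helper: runs the kernel with 4 threads, returns thread 0's result.
int run_last_square(int n) {
    const int threads = 4;

    // Optional: enlarge the device heap (16 MB is an arbitrary choice).
    cudaDeviceSetLimit(cudaLimitMallocHeapSize, 16 * 1024 * 1024);

    int *d_out = nullptr;
    if (cudaMalloc(&d_out, threads * sizeof(int)) != cudaSuccess) return -1;

    last_square<<<1, threads>>>(n, d_out);

    int h_out[threads];
    cudaError_t err =
        cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return err == cudaSuccess ? h_out[0] : -1;
}

int main() {
    std::printf("last square for n=8: %d\n", run_last_square(8));
    return 0;
}
```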
I would advise, however, that you think very carefully about whether you really need to do on-device memory allocation and de-allocation; it's likely to be costly performance-wise, and you can usually do better by pre-allocating memory via a host-side API call.
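One way the pre-allocation alternative could look, as a sketch: the host makes a single `cudaMalloc` call for all threads' scratch space, and each thread works on its own slice. The names `last_square_prealloc` and `run_prealloc` and the one-slice-per-thread layout are assumptions made for the example, not the only way to arrange this.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread uses its own n-element slice of a buffer the host allocated.
__global__ void last_square_prealloc(int n, int *scratch, int *out) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    int *mine = scratch + tid * n;  // this thread's slice, no device-side allocation

    for (int i = 0; i < n; ++i) mine[i] = i * i;
    out[tid] = mine[n - 1];
}

// Hypothetical helper: runs the kernel with 4 threads, returns thread 0's result.
int run_prealloc(int n) {
    const int threads = 4;
    int *d_scratch = nullptr;
    int *d_out = nullptr;

    // One host-side allocation covers the scratch space of every thread.
    if (cudaMalloc(&d_scratch, threads * n * sizeof(int)) != cudaSuccess) return -1;
    if (cudaMalloc(&d_out, threads * sizeof(int)) != cudaSuccess) {
        cudaFree(d_scratch);
        return -1;
    }

    last_square_prealloc<<<1, threads>>>(n, d_scratch, d_out);

    int h_out[threads];
    cudaError_t err =
        cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    cudaFree(d_scratch);
    cudaFree(d_out);
    return err == cudaSuccess ? h_out[0] : -1;
}

int main() {
    std::printf("last square for n=8: %d\n", run_prealloc(8));
    return 0;
}
```

The kernel does the same work as a version that calls new inside the kernel, but the allocation happens once, on the host, before the launch.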