Asynchronous programming does "grow" through the code base. It has been compared to a zombie virus. The best solution is to allow it to grow, but sometimes that's not possible.
I have written a few types in my Nito.AsyncEx library for dealing with a partially-asynchronous code base. There's no solution that works in every situation, though.
Solution A
If you have a simple asynchronous method that doesn't need to synchronize back to its context, then you can use Task.WaitAndUnwrapException:
var task = MyAsyncMethod();
var result = task.WaitAndUnwrapException();
You do not want to use Task.Wait or Task.Result because they wrap exceptions in AggregateException.
This solution is only appropriate if MyAsyncMethod does not synchronize back to its context. In other words, every await in MyAsyncMethod should end with ConfigureAwait(false). This means it can't update any UI elements or access the ASP.NET request context.
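To make the exception-wrapping difference concrete, here is a minimal sketch. It assumes a hypothetical MyAsyncMethod body, and it uses the built-in task.GetAwaiter().GetResult() as a stand-in for WaitAndUnwrapException so that it runs without the Nito.AsyncEx package; it is not the library's implementation.

```csharp
using System;
using System.Threading.Tasks;

public static class SolutionASketch
{
    // Hypothetical async method: every await ends with ConfigureAwait(false),
    // so no continuation needs the caller's context.
    public static async Task<int> MyAsyncMethod(bool fail)
    {
        await Task.Delay(10).ConfigureAwait(false);
        if (fail)
            throw new InvalidOperationException("boom");
        return 42;
    }

    // Blocks the calling thread. GetAwaiter().GetResult() rethrows the
    // original exception instead of wrapping it in AggregateException.
    public static int CallSynchronously(bool fail)
    {
        return MyAsyncMethod(fail).GetAwaiter().GetResult();
    }

    public static void Main()
    {
        Console.WriteLine(CallSynchronously(false));          // 42

        try { CallSynchronously(true); }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine(ex.Message);                    // boom
        }

        try { var unused = MyAsyncMethod(true).Result; }
        catch (AggregateException ex)
        {
            // Task.Result wraps the failure; the real exception is the inner one.
            Console.WriteLine(ex.InnerException.GetType().Name); // InvalidOperationException
        }
    }
}
```

The same caveat applies as for Solution A itself: this is only safe because the hypothetical method never resumes on the caller's context.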
Solution B
If MyAsyncMethod does need to synchronize back to its context, then you may be able to use AsyncContext.RunTask to provide a nested context:
var result = AsyncContext.RunTask(MyAsyncMethod).Result;
Update 4/14/2014: In more recent versions of the library the API is as follows:
var result = AsyncContext.Run(MyAsyncMethod);
(It's OK to use Task.Result in this example because RunTask will propagate Task exceptions.)
You may need AsyncContext.RunTask instead of Task.WaitAndUnwrapException because of a rather subtle deadlock that can happen on WinForms/WPF/SL/ASP.NET:
- A synchronous method calls an async method, obtaining a Task.
- The synchronous method does a blocking wait on the Task.
- The async method uses await without ConfigureAwait.
- The Task cannot complete in this situation because it only completes when the async method is finished; the async method cannot complete because it is attempting to schedule its continuation to the SynchronizationContext, and WinForms/WPF/SL/ASP.NET will not allow the continuation to run because the synchronous method is already running in that context.
This is one reason why it's a good idea to use ConfigureAwait(false) within every async method as much as possible.
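The deadlock steps above can be reproduced without a UI framework. The sketch below assumes a hypothetical SingleThreadContext that runs every posted callback on one thread, standing in for a WinForms/WPF context; it is an illustration, not how those frameworks are implemented. A timeout on the blocking wait keeps the demo from hanging forever.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a UI context: all posted callbacks run on one thread.
public sealed class SingleThreadContext : SynchronizationContext
{
    private readonly BlockingCollection<(SendOrPostCallback, object)> _queue =
        new BlockingCollection<(SendOrPostCallback, object)>();

    public override void Post(SendOrPostCallback d, object state) => _queue.Add((d, state));

    public void Complete() => _queue.CompleteAdding();

    // Runs queued callbacks on the calling thread until Complete() is called
    // and the queue has drained.
    public void RunLoop()
    {
        SetSynchronizationContext(this);
        foreach (var (callback, state) in _queue.GetConsumingEnumerable())
            callback(state);
    }
}

public static class DeadlockSketch
{
    // Plain await: the continuation is posted back to the captured context.
    static async Task WithContext() { await Task.Delay(50); }

    // ConfigureAwait(false): the continuation runs on the thread pool.
    static async Task WithoutContext() { await Task.Delay(50).ConfigureAwait(false); }

    // Returns true if a blocking wait on the single "UI" thread sees the task
    // complete within one second.
    public static bool BlockingWaitCompletes(Func<Task> asyncMethod)
    {
        var context = new SingleThreadContext();
        bool completed = false;
        context.Post(_ =>
        {
            Task task = asyncMethod();                      // synchronous caller obtains a Task
            completed = task.Wait(TimeSpan.FromSeconds(1)); // blocking wait on the context thread
            context.Complete();
        }, null);

        var thread = new Thread(context.RunLoop);
        thread.Start();
        thread.Join();
        return completed;
    }

    public static void Main()
    {
        Console.WriteLine(BlockingWaitCompletes(WithContext));    // False: continuation is stuck in the queue
        Console.WriteLine(BlockingWaitCompletes(WithoutContext)); // True
    }
}
```

In the first call the continuation sits in the queue behind the very callback that is waiting for it, which is the situation the list describes; without the timeout it would never return.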
Solution C
AsyncContext.RunTask won't work in every scenario. For example, if the async method awaits something that requires a UI event to complete, then you'll deadlock even with the nested context. In that case, you could start the async method on the thread pool:
var task = Task.Run(async () => await MyAsyncMethod());
var result = task.WaitAndUnwrapException();
However, this solution requires a MyAsyncMethod that will work in the thread pool context. So it can't update UI elements or access the ASP.NET request context. And in that case, you may as well add ConfigureAwait(false) to its await statements, and use solution A.
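As a rough illustration of what "the thread pool context" means here, the following sketch assumes a hypothetical MyAsyncMethod that reports whether it saw a SynchronizationContext when it started. It again uses GetAwaiter().GetResult() in place of WaitAndUnwrapException so that it needs no extra package.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SolutionCSketch
{
    // Hypothetical async method with a plain await (no ConfigureAwait(false)).
    static async Task<string> MyAsyncMethod()
    {
        bool hasContext = SynchronizationContext.Current != null;
        await Task.Delay(10);
        return hasContext ? "caller's context" : "no context (thread pool)";
    }

    // Starts the method on the thread pool, then blocks until it finishes.
    public static string RunOnThreadPool()
    {
        Task<string> task = Task.Run(() => MyAsyncMethod());
        return task.GetAwaiter().GetResult();
    }

    public static void Main()
    {
        // Install a context on this thread to mimic a caller that has one.
        SynchronizationContext.SetSynchronizationContext(new SynchronizationContext());
        Console.WriteLine(RunOnThreadPool()); // no context (thread pool)
    }
}
```

Because the method never sees the caller's context, its continuations cannot be blocked by the waiting thread, but for the same reason it cannot touch anything that belongs to that context.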
Update, 2019-05-01: The current "least-worst practices" are in an MSDN article here.
Thank you for sharing the PDF document. It helped us to determine that the problem you describe is not an iTextSharp problem. Instead it is a problem with the PDF document itself.
This problem doesn't have a solution, but I'm providing this answer to explain how you can discover for yourself that the problem also exists when iTextSharp isn't involved.
Open the document in Adobe Reader. Select the text "Muy señores nuestros" and copy/paste it into a text editor. You get "Muy señores nuestros". This is text that can be extracted using iTextSharp (it works correctly).
Now do the same with the text "GUARDIAN GLASS EXPRESS, S.L.". You get the following result: "". As you can see, you cannot copy/paste the text correctly from Adobe Reader. This is due to the way the text is stored in the PDF. If you cannot copy/paste the text from Adobe Reader, you should not expect to be able to extract the text using iTextSharp. The PDF is created in a way that doesn't allow extraction.
Please take a look at this video to find out some possible causes: https://www.youtube.com/watch?v=wxGEEv7ibHE
I'm sorry that it took so long to figure this out and that it turns out that you're asking something that isn't possible. Your question narrowed the problem down too much, as if the problem was caused by the "IDENTITY-H" encoding and iTextSharp. In reality, you're trying to extract text that can't be extracted.
If you look at the page dictionary inside the PDF, you'll find three font resources for the first (and only) page:

In the content stream (below the small red arrow), you see two strings (hexadecimal notation) that are shown using fonts referenced using the names C2_0 and C2_1. Incidentally, these fonts are stored as composite fonts with /SubType 0 and /Encoding Identity-H. This means that the characters used in the hexadecimal string should correspond with the Unicode values of the glyphs. If that's not the case, you're out of luck.
There seems to be no problem with the font for which the name /TT0 is used.
The fact that /TT0 uses WinAnsiEncoding and the other fonts use Identity-H is irrelevant. There are plenty of PDF files with fonts that use Identity-H whose text can be copy/pasted or extracted using iTextSharp. Unfortunately, there is probably something wrong with the way your PDF was constructed. It would take too much time to analyze what went wrong, so your best shot is to contact the person who gave you the PDF and ask him/her to fix it.
Best Solution
How about this?
VB.NET doesn't use [] for arrays, it uses () instead.