Asynchronous programming does "grow" through the code base. It has been compared to a zombie virus. The best solution is to allow it to grow, but sometimes that's not possible.
I have written a few types in my Nito.AsyncEx library for dealing with a partially-asynchronous code base. There's no solution that works in every situation, though.
If you have a simple asynchronous method that doesn't need to synchronize back to its context, then you can use Task.WaitAndUnwrapException:
var task = MyAsyncMethod();
var result = task.WaitAndUnwrapException();
You do not want to use Task.Wait or Task.Result because they wrap exceptions in AggregateException.
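The difference in exception wrapping can be seen with nothing but the base class library. This is a minimal sketch, not code from Nito.AsyncEx: FailAsync is a hypothetical method, and GetAwaiter().GetResult() is used as a built-in stand-in that also rethrows the original exception instead of an AggregateException.

```csharp
using System;
using System.Threading.Tasks;

public static class ExceptionWrappingDemo
{
    // Hypothetical async method that always fails after a short delay.
    public static async Task<int> FailAsync()
    {
        await Task.Delay(10).ConfigureAwait(false);
        throw new InvalidOperationException("boom");
    }

    // Blocks with Task.Result and reports the type of exception observed.
    public static string CatchViaResult()
    {
        try { _ = FailAsync().Result; return "none"; }
        catch (Exception ex) { return ex.GetType().Name; }
    }

    // Blocks with GetAwaiter().GetResult() and reports the exception type.
    public static string CatchViaGetResult()
    {
        try { FailAsync().GetAwaiter().GetResult(); return "none"; }
        catch (Exception ex) { return ex.GetType().Name; }
    }

    public static void Main()
    {
        Console.WriteLine(CatchViaResult());    // AggregateException
        Console.WriteLine(CatchViaGetResult()); // InvalidOperationException
    }
}
```

With Task.Result the caller has to dig the real exception out of AggregateException.InnerException; with the unwrapping style an ordinary catch block for the original exception type keeps working.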
This solution is only appropriate if MyAsyncMethod does not synchronize back to its context. In other words, every await in MyAsyncMethod should end with ConfigureAwait(false). This means it can't update any UI elements or access the ASP.NET request context.
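As a rough illustration of what "every await ends with ConfigureAwait(false)" looks like, here is a small sketch. CountBytesAsync is a hypothetical stand-in for MyAsyncMethod; it touches no UI or request state, so it never needs to resume on the caller's context.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class ContextFreeDemo
{
    // Hypothetical stand-in for MyAsyncMethod: every await opts out of
    // resuming on the captured context.
    public static async Task<int> CountBytesAsync(Stream source)
    {
        var buffer = new byte[4096];
        int total = 0;
        int read;
        while ((read = await source.ReadAsync(buffer, 0, buffer.Length)
                                   .ConfigureAwait(false)) > 0)
        {
            total += read;
        }
        await Task.Delay(10).ConfigureAwait(false);
        return total;
    }

    public static void Main()
    {
        var stream = new MemoryStream(Encoding.UTF8.GetBytes("hello"));
        // Blocking here is safe only because no await above needs the context.
        Console.WriteLine(CountBytesAsync(stream).GetAwaiter().GetResult()); // 5
    }
}
```

A single await without ConfigureAwait(false) anywhere in the call chain is enough to bring the context dependency back.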
If MyAsyncMethod does need to synchronize back to its context, then you may be able to use AsyncContext.RunTask to provide a nested context:
var result = AsyncContext.RunTask(MyAsyncMethod).Result;
Update 4/14/2014: In more recent versions of the library, the API is as follows:
var result = AsyncContext.Run(MyAsyncMethod);
(It's OK to use Task.Result in this example because RunTask will propagate the task's exceptions.)
The reason you may need AsyncContext.RunTask instead of Task.WaitAndUnwrapException is a rather subtle deadlock possibility that happens on WinForms/WPF/SL/ASP.NET:
- A synchronous method calls an async method, obtaining a Task.
- The synchronous method does a blocking wait on the Task.
- The async method uses await without ConfigureAwait(false).
- The Task cannot complete in this situation because it only completes when the async method is finished; the async method cannot complete because it is attempting to schedule its continuation to the SynchronizationContext, and WinForms/WPF/SL/ASP.NET will not allow the continuation to run because the synchronous method is already running in that context.
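The steps above can be reproduced without a UI framework. The sketch below is only an approximation: QueueingContext is a hypothetical single-threaded context that queues continuations the way a message loop would, and the blocking wait uses a timeout so the demo ends instead of hanging forever.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical context: posted continuations are queued and would only run
// if the owning thread came back to pump the queue (like a UI message loop).
public sealed class QueueingContext : SynchronizationContext
{
    public readonly ConcurrentQueue<Action> Pending = new ConcurrentQueue<Action>();

    public override void Post(SendOrPostCallback d, object state)
        => Pending.Enqueue(() => d(state));
}

public static class DeadlockDemo
{
    public static async Task<int> CapturesContextAsync()
    {
        await Task.Delay(50);                       // continuation posted to the captured context
        return 1;
    }

    public static async Task<int> IgnoresContextAsync()
    {
        await Task.Delay(50).ConfigureAwait(false); // continuation runs on the thread pool
        return 2;
    }

    // Calls the async method under the queueing context and blocks on the
    // result, as a synchronous caller would. Returns false if the task
    // did not complete within one second.
    public static bool CompletesWhileBlocked(Func<Task<int>> asyncMethod)
    {
        var previous = SynchronizationContext.Current;
        SynchronizationContext.SetSynchronizationContext(new QueueingContext());
        try
        {
            Task<int> task = asyncMethod();
            return task.Wait(TimeSpan.FromSeconds(1));
        }
        finally
        {
            SynchronizationContext.SetSynchronizationContext(previous);
        }
    }

    public static void Main()
    {
        Console.WriteLine(CompletesWhileBlocked(CapturesContextAsync)); // False
        Console.WriteLine(CompletesWhileBlocked(IgnoresContextAsync));  // True
    }
}
```

In the first call the continuation sits in the queue while the only thread that could run it is blocked in Wait, which is the same shape as the deadlock described above.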
This is one reason why it's a good idea to use ConfigureAwait(false) within every async method as much as possible.
AsyncContext.RunTask won't work in every scenario. For example, if the async method awaits something that requires a UI event to complete, then you'll deadlock even with the nested context. In that case, you could start the async method on the thread pool:
var task = Task.Run(async () => await MyAsyncMethod());
var result = task.WaitAndUnwrapException();
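To see why moving the call to the thread pool helps, here is a small BCL-only sketch. SawContextAsync is a hypothetical stand-in for MyAsyncMethod, and GetAwaiter().GetResult() stands in for WaitAndUnwrapException. The method records whether it was started under a SynchronizationContext.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadPoolDemo
{
    // Hypothetical MyAsyncMethod that does NOT use ConfigureAwait(false).
    // It reports whether a SynchronizationContext was present when it started.
    public static async Task<bool> SawContextAsync()
    {
        bool sawContext = SynchronizationContext.Current != null;
        await Task.Delay(10); // would capture the context, if there were one
        return sawContext;
    }

    // Starts the async method on the thread pool, then blocks for the result.
    public static bool RunOnThreadPool()
        => Task.Run(() => SawContextAsync()).GetAwaiter().GetResult();

    public static void Main()
    {
        Console.WriteLine(RunOnThreadPool()); // False
    }
}
```

Because the method starts on a thread pool thread, its awaits have no UI or request context to capture, so the blocked caller cannot starve the continuation.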
However, this solution requires a MyAsyncMethod that will work in the thread pool context, so it can't update UI elements or access the ASP.NET request context. And in that case, you may as well add ConfigureAwait(false) to its await statements and use the first solution (Task.WaitAndUnwrapException).
Update, 2019-05-01: The current "least-worst practices" are in an MSDN article here.
Thank you for sharing the PDF document. It helped us to determine that the problem you describe is not an iTextSharp problem. Instead it is a problem with the PDF document itself.
This problem doesn't have a solution, but I'm providing this answer to explain how you can discover for yourself that the problem also exists when iTextSharp isn't involved.
Open the document in Adobe Reader. Select the text "Muy señores nuestros" and copy/paste it into a text editor. You get "Muy señores nuestros". This is text that can be extracted using iTextSharp (it works correctly).
Now do the same with the text "GUARDIAN GLASS EXPRESS, S.L.". You get the following result: "". As you can see, you cannot copy/paste the text correctly from Adobe Reader. This is due to the way the text is stored in the PDF. If you cannot copy/paste the text from Adobe Reader, you should not expect to be able to extract the text using iTextSharp. The PDF is created in a way that doesn't allow extraction.
Please take a look at this video to find out some possible causes: https://www.youtube.com/watch?v=wxGEEv7ibHE
I'm sorry that it took so long to figure this out and that it turns out that you're asking something that isn't possible. Your question narrowed the problem down too much, as if the problem was caused by the "IDENTITY-H" encoding and iTextSharp. In reality, you're trying to extract text that can't be extracted.
If you look at the page dictionary inside the PDF, you'll find three font resources for the first (and only) page:
In the content stream (below, at the small red arrow), you see two strings (hexadecimal notation) that are shown using fonts referenced by names such as /C2_1. Incidentally, these fonts are stored as composite fonts with /Subtype /Type0 and /Encoding /Identity-H. This means that the characters used in the hexadecimal string should correspond with the Unicode values of the glyphs. If that's not the case, you're out of luck.
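To make the point about hexadecimal strings concrete, here is a simplified sketch that reads a hex string as 2-byte big-endian codes and interprets them directly as Unicode. It is not what iTextSharp does internally, and both hex strings are made up for illustration; neither comes from the PDF discussed here.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class HexStringDemo
{
    // Interprets a PDF-style hex string as 2-byte big-endian codes and
    // reads those codes as UTF-16 characters.
    public static string DecodeAsUnicode(string hex)
    {
        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
        {
            bytes[i] = byte.Parse(hex.Substring(2 * i, 2),
                                  NumberStyles.HexNumber,
                                  CultureInfo.InvariantCulture);
        }
        return Encoding.BigEndianUnicode.GetString(bytes);
    }

    public static void Main()
    {
        // Made-up string whose codes match Unicode values: readable text.
        Console.WriteLine(DecodeAsUnicode("0047005500410052")); // GUAR

        // Made-up string whose codes are unrelated to Unicode
        // (for example, internal glyph numbers): gibberish.
        Console.WriteLine(DecodeAsUnicode("002A003800240035")); // *8$5
    }
}
```

When the codes in the content stream look like the second string, copy/paste and extraction both produce garbage, even though the glyphs are drawn correctly on the page.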
There seems to be no problem with the font for which the name /TT0 is used.
The fact that /TT0 uses WinAnsiEncoding and the other fonts use Identity-H is irrelevant. There are plenty of PDF files with fonts that use Identity-H whose text can be copy/pasted or extracted using iTextSharp. Unfortunately, there is probably something wrong with the way your PDF was constructed. It would take too much time to analyze what went wrong, so your best shot is to contact the person who gave you the PDF and ask him/her to fix it.
How about this?
VB.NET doesn't use [] for arrays, it uses ().